Abstract. The framework of Bodlaender et al. (ICALP 2008) and Fortnow and Santhanam (STOC
2008) allows us to exclude the existence of polynomial kernels for a range of problems under
reasonable complexity-theoretical assumptions. However, there are also some issues that are not
addressed by this framework, including the existence of Turing kernels such as the "kernelization"
of Leaf Out Branching(k) into a disjunction over n instances of size poly(k).

Observing that Turing kernels are preserved by polynomial parametric transformations, we
define a kernelization hardness hierarchy, akin to the M- and W-hierarchy of ordinary
parameterized complexity, by the PPT-closure of problems that seem likely to be fundamentally hard for
efficient Turing kernelization. We find that several previously considered problems are complete
for our fundamental hardness class, including Min Ones d-SAT(k), Binary NDTM Halting(k),
Connected Vertex Cover(k), and Clique(k log n), the clique problem parameterized by k log n.

1 Introduction 

Parameterized complexity theory [18, 21] is concerned with whether problems can be solved in 
$f(k) \cdot n^{O(1)}$ time, where n is the total input size, k a parameter typically independent of the
input size, and $f()$ an arbitrary computable function. Such problems are called fixed parameter
tractable, and FPT is the class of all fixed parameter tractable problems. In this sense, FPT 
extends the class P of tractable problems in classical complexity theory, allowing a refined 
analysis of hard computational problems. Complementing this notion of tractability is a set of 
classes of fixed parameter intractability, which allow classifying problems that are unlikely to 
be in FPT. These classes are organized into two main hierarchies, the W- and the M-hierarchy, 
which intertwine together to form an infinite hierarchy of intractability: 

$\mathrm{FPT} \subseteq \mathrm{M}[1] \subseteq \mathrm{W}[1] \subseteq \mathrm{M}[2] \subseteq \mathrm{W}[2] \subseteq \cdots$

Arguably the most useful technique in parameterized complexity is kernelization. A ker- 
nelization algorithm (or kernel) is a polynomial-time reduction from a problem to itself that 
compresses any problem instance to an equivalent instance of size $f(k)$. Appropriately, the
function $f()$ is referred to as the size of the kernel. Not only is kernelization one of the most
successful techniques for showing that a problem is fixed-parameter tractable, it also provides
an equivalent way of defining fixed-parameter tractability: A problem is solvable in $f(k) \cdot n^{O(1)}$
time iff it has a kernel [13]. Moreover, kernelization embodies within it the ubiquitous technique 
of preprocessing (data reduction), and gives the first natural and meaningful framework for 
analyzing this technique. 



Since any fixed parameter tractable problem has a kernel, it is natural to ask which problems 
admit particularly efficient kernels, which have traditionally been defined as kernels with poly- 
nomial size bounds. Problems admitting polynomial kernels form a natural subclass of FPT, 
and in fact (in the case of problems in NP) also of EXPT, the class of parameterized problems 
solvable in $2^{k^{O(1)}} \cdot n^{O(1)}$ time. Examples of polynomial kernels are in abundance, including the
linear kernels for Vertex Cover [34] and Planar Dominating Set [2], the quadratic kernel 
for Feedback Vertex Set [35], and the meta-theorems for kernelization on bounded genus 
graphs [7] (see also the surveys in [5,23]). 

In recent years there has been growing interest in lower bounds for kernelization, and in
particular in determining which problems are unlikely to admit polynomial kernels. This research
effort started with the work of Bodlaender et al. [6], which developed a machinery for excluding
polynomial-size kernels under the assumption that the polynomial hierarchy (PH) does not
collapse, relying on a key lemma of Fortnow and Santhanam [22]. This machinery was used to
show that problems such as Path(k) and Clique(tw) (the classical clique problem parameterized
by the treewidth of the input graph) do not have polynomial-size kernels unless PH collapses [6].
Extensions of this framework soon followed [8, 10, 15-17], and were used to exclude
polynomial kernels for numerous problems, including Leaf Out Branching(k) [20], Disjoint
Cycles(k) [10], Connected Vertex Cover(k) [17], and several CSP problems [29,30], to
name just a few.

The above-mentioned lower bound mechanisms thus proved very useful in determining which
problems allow polynomial kernelizations. However, there are relaxed, yet still interesting,
notions of efficient kernelization to which these frameworks do not apply. In particular, they do
not allow the exclusion of polynomial Turing kernelizations, where the kernel reduction is of
Turing-type rather than of Karp-type. This is not merely a theoretical notion. For example,
the Leaf Out Branching(k) problem, which as mentioned above has no polynomial kernel
unless PH collapses, admits a kernel of size $O(k^3)$ as soon as the root of the out-branching
has been selected [20]. As another example, one might consider the Clique($\Delta$) problem, the
classical clique problem parameterized by the maximum degree of the graph. Using e.g. [6], it is
trivial to exclude the existence of a polynomial kernel for this problem, but simply making one
initial selection of a vertex that is to be a member of the clique reduces the instance down to
size $\Delta$. Thus, both these problems have Turing kernels where an instance is transformed into a
disjunction over n instances of small size. Clearly, in general one may also expect the existence
of more involved Turing kernels for other problems.
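
To make the disjunction-over-n-instances idea concrete, here is a minimal sketch for the clique problem parameterized by the maximum degree. It guesses one clique member v and asks a small question about the closed neighborhood of v only. The function names and the brute-force `has_clique` routine (standing in for the oracle) are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def has_clique(adj, vertices, size):
    """Stand-in for the oracle: brute-force search for a clique of the
    given size inside `vertices` (adj maps each vertex to its neighbor set)."""
    return any(
        all(v in adj[u] for u, v in combinations(cand, 2))
        for cand in combinations(sorted(vertices), size)
    )

def clique_via_small_queries(adj, k):
    """Decide Clique with one small query per guessed clique member."""
    for v in adj:
        closed_nbhd = {v} | adj[v]           # at most (max degree + 1) vertices
        if has_clique(adj, closed_nbhd, k):  # query size depends only on the degree
            return True
    return False
```

Any clique containing v lies inside the closed neighborhood of v, so the answer is the OR of n queries, each on at most $\Delta + 1$ vertices.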

To obtain lower bounds for Turing kernelizations, we adopt a different approach than pre- 
viously considered. First of all, we observe that polynomial Turing kernels are preserved under 
so-called polynomial parametric transformations (PPTs), a type of reduction introduced by Bod- 
laender et al. [10] to exclude regular polynomial kernelizations. We then identify a hierarchy 
of problems in EXPT which we believe to be hard to kernelize, even under the relaxed notion 
of Turing reductions. These problems are reparameterizations of satisfiability problems used to 
define the W- and M-hierarchies. We then consider the hierarchies of classes defined by taking 
the PPT-closure of these problems, which leads to two hierarchies of classes: The WK-hierarchy 
and the MK-hierarchy. These hierarchies refine the class EXPT, intertwining together to form 
a tower of inclusions similar to the one formed by the W- and M-hierarchies: 

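$\mathrm{MK}[1] \subseteq \mathrm{WK}[1] \subseteq \mathrm{MK}[2] \subseteq \mathrm{WK}[2] \subseteq \mathrm{MK}[3] \subseteq \cdots \subseteq \mathrm{EXPT} \subseteq \mathrm{FPT}$.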

The class MK[1] corresponds to problems with polynomial kernels. Thus, the fundamental
hardness class of our hierarchies is WK[1], which plays the same role for kernelizability as W[1]
plays for FPT-time algorithms. Similarly to the expectation that $\mathrm{FPT} \neq \mathrm{W}[1]$, there are strong
reasons to believe that WK[1]-hard problems do not admit any form of efficient kernelizations.
In particular, if any WK[1]-hard problem admits a polynomial Turing kernelization, then so do
all problems in WK[1], including:

- The Binary NDTM Halting(k) problem, which is the k-step halting problem for
non-deterministic single-tape Turing machines with a binary tape alphabet, where k is taken
as the parameter. This problem is PPT-equivalent to NDTM Halting(k log n), the k-step
halting problem for general single-tape Turing machines under the parameter k log n (where
n is the total input size). A polynomial Turing kernel for either of these two problems implies
polynomial Turing kernelizations for all problems in NP for which a witness can be verified
(even non-deterministically) in time polynomial only in the witness size (as opposed to the
total input size).

- Min Ones d-SAT(k), where the parameter is taken to be the Hamming weight k of the
solution, i.e. the number of variables set to true. Efficient kernelization for Min Ones
d-SAT(k) implies efficient kernelization for all minimization problems for which consistency
can be verified by local conditions, e.g. the $\mathcal{H}$-Free Edge Modification(k) problem,
where $\mathcal{H}$ is a finite set of forbidden induced subgraphs, and the goal is to remove or add k
edges in the input graph in order to obtain a graph with no induced subgraph in $\mathcal{H}$ [12].
Note that some of these problems are not immediately covered by the previous item.

- Most natural W[1]-complete problems reparameterized under the parameter k log n. For
example, the ubiquitous clique problem is WK[1]-complete under this parameterization.
If one believes in the fundamental hardness of the clique problem, then it seems
unlikely that a polynomial-time Turing reduction would reduce it to questions about graphs
of size $(k \log n)^{O(1)}$.

- Several EXPT problems under the standard natural parameterization of solution size, which
turn out to be complete for WK[1]. These include Unique Coverage(k), Multicolored
Path(k), and Connected Vertex Cover(k).

The remainder of the paper is organized as follows. In Section 2 we give precise definitions for 
the central concepts used in this paper. Section 3 is then used to define our classes of inefficient 
kernelizability, namely the WK- and MK-hierarchies, and also to discuss basic properties of 
these hierarchies. The main technical body of our work is presented in Section 4, where we 
prove that several problems are complete for our fundamental hardness class WK[1], while in 
Section 5 we discuss problems that reside in higher levels of our hierarchies. We conclude the 
paper in Section 6 by posing some open problems. 

2 Preliminaries

We begin our discussion by formally defining some of the main concepts used in this paper, 
and by introducing some terminology and notation that will be used throughout. We use [n] to 
denote the set of integers $\{1, \ldots, n\}$ for any integer $n \ge 1$.

Definition 1 (Kernelization). A kernelization algorithm, or, in short, a kernel for a
parameterized problem $L \subseteq \Sigma^* \times \mathbb{N}$ is a polynomial-time algorithm that on a given input $(x, k) \in \Sigma^* \times \mathbb{N}$
outputs a pair $(x', k') \in \Sigma^* \times \mathbb{N}$ such that

- $(x, k) \in L \iff (x', k') \in L$, and

- $|x'| + k' \le f(k)$ for some function $f()$.

The function $f()$ above is referred to as the size of the kernel.

In other words, a kernel is a polynomial-time reduction from a problem to itself that compresses 
the problem instance to a size depending only on the parameter. If the size of a kernel for 
L is polynomial, we say that L has a polynomial kernel. In the interest of robustness and
ease of presentation, we relax the notion of kernelization to also allow the output to be an 
instance of a different problem. This has been referred to as a generalized kernelization [6] 
or bikernelization [3]. The class of all parameterized problems with polynomial kernels in this 
relaxed sense is denoted by PK. 

Definition 2 (Turing Kernelization). A Turing kernelization for a parameterized problem
$L \subseteq \Sigma^* \times \mathbb{N}$ is a polynomial-time algorithm with oracle access to a parameterized problem $L'$
that can decide whether an input $(x, k)$ is in $L$ using queries of size bounded by $f(k)$, for some
computable function $f()$. Again, the function $f()$ is referred to as the size of the kernel.

If the size is polynomial, we say that L has a polynomial Turing kernel. The class of all param- 
eterized problems with polynomial Turing kernels is denoted by Turing-PK. 

Definition 3 (Polynomial Parametric Transformations [10]). Let $L_1$ and $L_2$ be two
parameterized problems. We write $L_1 \le_{ppt} L_2$ if there exists a polynomial time computable
function $\Psi : \{0, 1\}^* \times \mathbb{N} \to \{0, 1\}^* \times \mathbb{N}$ and a constant $c \in \mathbb{N}$, such that for all $(x, k) \in \Sigma^* \times \mathbb{N}$, if
$(x', k') = \Psi(x, k)$ then:

- $(x, k) \in L_1 \iff (x', k') \in L_2$, and

- $k' \le c k^c$.

The function $\Psi$ is called a polynomial parametric transformation (PPT for short). If $L_1 \le_{ppt} L_2$
and $L_2 \le_{ppt} L_1$ we write $L_1 \equiv_{ppt} L_2$.

Lemma 1. Let $L_1$, $L_2$, and $L_3$ be three parameterized problems.

- If $L_1 \le_{ppt} L_2$ and $L_2 \le_{ppt} L_3$ then $L_1 \le_{ppt} L_3$.

- If $L_1 \le_{ppt} L_2$ and $L_2 \in$ PK (resp. Turing-PK) then $L_1 \in$ PK (resp. Turing-PK).

For $t \ge 0$ and $d \ge 1$, we inductively define the following classes $\Gamma_{t,d}$ and $\Delta_{t,d}$ of formulas
following [21]:

$\Gamma_{0,d} := \{\lambda_1 \wedge \cdots \wedge \lambda_c : c \in [d] \text{ and } \lambda_1, \ldots, \lambda_c \text{ are literals}\}$,

$\Delta_{0,d} := \{\lambda_1 \vee \cdots \vee \lambda_c : c \in [d] \text{ and } \lambda_1, \ldots, \lambda_c \text{ are literals}\}$,

$\Gamma_{t+1,d} := \{\bigwedge_{i \in I} \delta_i : I \text{ is a finite non-empty index set and } \delta_i \in \Delta_{t,d} \text{ for all } i \in I\}$,

$\Delta_{t+1,d} := \{\bigvee_{i \in I} \gamma_i : I \text{ is a finite non-empty index set and } \gamma_i \in \Gamma_{t,d} \text{ for all } i \in I\}$.

Thus, $\Gamma_{1,3}$ is the set of all 3-CNF formulas, and $\Gamma_{2,1}$ is the set of all CNF formulas. Given a class $\Phi$
of propositional formulas, we let $\Phi^+, \Phi^- \subseteq \Phi$ denote the restrictions of $\Phi$ to formulas containing
only positive and negative literals, respectively. For any given $\Phi$, we define two parameterized
problems:

- $\Phi$-WSAT($k \log n$) is the problem of determining whether a formula $\phi \in \Phi$ with n variables
has a satisfying assignment of Hamming weight k, parameterized by $k \log n$.

- $\Phi$-SAT($n$) is the problem of determining whether a formula $\phi \in \Phi$ with n variables is
satisfiable, parameterized by n.

In particular, we will be interested in $\Gamma_{t,d}$-WSAT($k \log n$) and $\Gamma_{t,d}$-SAT($n$).
3 The WK- and MK-Hierarchies



In the following section we introduce our hierarchies of inefficient kernelizability, the MK- and
WK-hierarchies. For a parameterized problem $L \subseteq \Sigma^* \times \mathbb{N}$, we let $[L]_{\le_{ppt}}$ denote the closure of
L under polynomial parametric transformations. That is, $[L]_{\le_{ppt}} := \{L' \subseteq \Sigma^* \times \mathbb{N} : L' \le_{ppt} L\}$.

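Definition 4. Let $t \ge 1$ be an integer. The classes WK[t] and MK[t] are defined by

- WK[t] := $\bigcup_{d \in \mathbb{N}} [\Gamma_{t,d}\text{-WSAT}(k \log n)]_{\le_{ppt}}$.

- MK[t] := $\bigcup_{d \in \mathbb{N}} [\Gamma_{t,d}\text{-SAT}(n)]_{\le_{ppt}}$.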

The naming of the classes in our hierarchies comes from the close relationship of the MK- 
and WK-hierarchies to the M- and W-hierarchies of traditional parameterized complexity [21]. 
Roughly speaking, WK[t] and MK[t] are reparameterizations by a factor of log n of the
traditional parameterized complexity classes W[t] and M[t] (although W[t] and M[t] are closed under
FPT reductions, which may use superpolynomial time in k). There are also close connections to 
the so-called subexponential time S-hierarchy (see [21, Chapter 16]); specifically, S[t] and MK[t] 
are defined from the same starting problems, using closures under different types of reduction. 

Yet another related hierarchy is the VC-hierarchy defined by Harnik and Naor [24]. The 
VC-hierarchy concerns a notion of instance compression which Harnik and Naor argue to be 
similar to but distinct from polynomial kernelization. Without going into technical details, we 
note that the problem Local Circuit SAT, which defines their class VC_1, is WK[1]-complete
under a parameter of k log n (see next section), while the defining problem for VC_t for t > 1 is
trivially MK[t]-complete under a parameter of n. See [24] for all definitions.

Theorem 1. Let $t \ge 1$. The following hold.

- $\Gamma^-_{1,2}$-WSAT($k \log n$) is WK[1]-complete.

- $\Gamma^-_{t,1}$-WSAT($k \log n$) is WK[t]-complete for odd t > 1.

- $\Gamma^+_{t,1}$-WSAT($k \log n$) is WK[t]-complete for even t > 1.

- $\Gamma_{1,d}$-SAT($n$) is MK[1]-complete for every $d \ge 3$.

- $\Gamma_{t,1}$-SAT($n$) is MK[t]-complete for $t \ge 2$.

Theorem 1 above shows that the traditional problems used for showing completeness in the 
W- and M-hierarchies have reparameterized counterparts which are complete for our hierarchy. 
The theorem is proven using a set of PPTs from the specific class-defining problems to the 
corresponding target problem in the theorem. For the first item in the theorem, previous proofs 
have used FPT-time reductions; we provide a PPT. The remaining items are either easy or 
well-known. 

Lemma 2. Let $d \ge 1$. Then $\Gamma_{1,d}$-WSAT($k \log n$) $\le_{ppt}$ $\Gamma^-_{1,2}$-WSAT($k \log n$).

Proof. The lemma is trivial for d = 1, so assume $d \ge 2$. We show the proof in four steps. First
we transform our input formula into an anti-monotone $\Gamma^-_{1,d}$ formula. The anti-monotone formula
is then transformed into a multicolored $\Gamma^-_{1,d}$ formula, then to a multicolored $\Gamma^-_{1,2}$ formula, and
finally to an uncolored formula. Since the transformation at each step will be a PPT, this
will prove the lemma.

$\Gamma_{1,d}$-WSAT($k \log n$) $\le_{ppt}$ $\Gamma^-_{1,d}$-WSAT($k \log n$): To transform to an anti-monotone formula,
we adapt a trick used in [21]. Let $\phi$ be a $\Gamma_{1,d}$-formula on variables $X = \{x_1, \ldots, x_n\}$. We
introduce a new set $Y$ of variables $y_{i,j,j'}$, with the interpretation "the i'th true variable is $x_j$ and
the (i + 1)'th true variable is $x_{j'}$", taken over the ordering of X assumed above. We will convert
the formula $\phi$ into a formula $\phi'$ using only the variables Y. By a combination of the Hamming
weight condition and negative 2-clauses, we can enforce consistency among the y-variables.
Specifically, it is easy to enforce the following structure on the solutions of the formula:
- At most one y-variable with first index i is true, for every $1 \le i < k$.

- If $y_{i,j_1,j_2}$ and $y_{i+1,j_3,j_4}$ are both true, then $j_1 < j_2 = j_3 < j_4$.

- If $y_{i,j_1,j_2}$ and $y_{i',j_3,j_4}$ are both true, and $i' > i + 1$, then $j_1 < j_2 < j_3 < j_4$.

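Thus, if there are $k - 1$ true y-variables, then these correspond to an ordered sequence of k
variables $x_j$. Let $\phi''$ be the formula containing these clauses. We can now replace every clause
of $\phi$ by a conjunction of negative clauses, as follows. First, we may replace every literal $x_j$ or
$\neg x_j$ by a conjunction of negative literals of y-variables:

\[
x_j = \bigwedge_{j_1 > j,\, j_2 > j} \neg y_{1,j_1,j_2} \;\wedge\; \bigwedge_{i,\, j_1 < j,\, j_2 > j} \neg y_{i,j_1,j_2} \;\wedge\; \bigwedge_{j_1 < j,\, j_2 < j} \neg y_{k-1,j_1,j_2},
\]

\[
\neg x_j = \bigwedge_{i,\, j'} \neg y_{i,j,j'} \;\wedge\; \bigwedge_{j'} \neg y_{k-1,j',j}.
\]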

To see why these equalities hold, negate both sides of the equations; the result is an obvious
description of $x_j$ and $\neg x_j$ as disjunctions over positive y-variables. Second, since every clause
in $\phi$ has bounded size, we may multiply out these conjunctions (using the distributive law of
conjunctions) into individual d-clauses over literals $\neg y_{i,j,j'}$. Let $\phi'$ be the resulting formula. Then
$\phi' \wedge \phi''$ is a $\Gamma^-_{1,d}$-formula and has a satisfying assignment of weight $k' = k - 1$ if and only if $\phi$
has a satisfying assignment of weight k.

$\Gamma^-_{1,d}$-WSAT($k \log n$) $\le_{ppt}$ Multicolored $\Gamma^-_{1,d}$-WSAT($k \log n$): Let $(\phi, k)$ be an instance
of $\Gamma^-_{1,d}$-WSAT($k \log n$). We will produce an equivalent multicolored instance, where variables
come in one of k colors, and the solution is required to contain exactly one variable of each color.
This is easy since $\phi$ is anti-monotone: Create k copies of the entire variable set, each colored in
a different color. For each variable $x_i$ in $\phi$, let $x_{i,c}$ be the copy of $x_i$ of color c. For every $x_i$ and
every pair of colors $c \neq c'$, add a clause $(\neg x_{i,c} \vee \neg x_{i,c'})$, ensuring that k distinct variables are
chosen. Then, for every clause $(\neg x_1 \vee \ldots \vee \neg x_d)$ in $\phi$, replace it by one clause $(\neg x_{1,c_1} \vee \ldots \vee \neg x_{d,c_d})$
for every set of distinct colors $c_1, \ldots, c_d \in [k]$. Let $\phi'$ be the resulting formula. Note that each
clause in $\phi$ excludes only a specific combination $x_1 \wedge \ldots \wedge x_d$, while the set of replacing clauses
in $\phi'$ collectively excludes every possible selection of $x_1, \ldots, x_d$ from different colors; thus $\phi'$ has
a multicolored satisfying assignment of weight k iff $\phi$ has a satisfying assignment of weight k.
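
The coloring step can be checked on tiny instances with the following sketch. It assumes an anti-monotone formula is given as a list of clauses, each a tuple of distinct variable indices that are all negated; the representation, the function names, and the brute-force checkers are illustrative assumptions, and "every set of distinct colors" is read here as every ordered choice of distinct colors.

```python
from itertools import combinations, permutations, product

def multicolor(clauses, n, k):
    """Map anti-monotone clauses over variables 0..n-1 to anti-monotone
    clauses over colored copies (variable, color) with colors 0..k-1."""
    out = []
    for x in range(n):                       # no variable may be picked in two colors
        for c1, c2 in combinations(range(k), 2):
            out.append(((x, c1), (x, c2)))
    for clause in clauses:                   # one copy per choice of distinct colors
        for colors in permutations(range(k), len(clause)):
            out.append(tuple(zip(clause, colors)))
    return out

def has_weight_k(clauses, n, k):
    """Brute force: is there a set of k true variables containing no clause?"""
    return any(
        not any(set(cl) <= set(s) for cl in clauses)
        for s in combinations(range(n), k)
    )

def has_multicolored(colored_clauses, n, k):
    """Brute force: pick exactly one variable per color, violating no clause."""
    for choice in product(range(n), repeat=k):
        true_vars = {(x, c) for c, x in enumerate(choice)}
        if not any(set(cl) <= true_vars for cl in colored_clauses):
            return True
    return False
```

The output has $nk$ variables and weight k, so the parameter grows only polynomially, as a PPT requires.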

Multicolored $\Gamma^-_{1,d}$-WSAT($k \log n$) $\le_{ppt}$ Multicolored $\Gamma^-_{1,2}$-WSAT($k \log n$): Let $\phi$ be
a multicolored $\Gamma^-_{1,d}$-formula on variables X, $|X| = n$, and k colors. We create a multicolored
$\Gamma^-_{1,2}$-formula $\phi'$ as follows: Let the colors of $\phi'$ correspond to d-tuples of colors from $\phi$; thus $\phi'$
has $k' = O(k^d)$ colors. For every color C of $\phi'$, corresponding to colors $(c_1, \ldots, c_d)$ of $\phi$, create a
set of variables as follows. Initialize with the set $X_C = \{(x_1, \ldots, x_d) : x_i \in X, x_i \text{ has color } c_i, 1 \le i \le d\}$.
Then, remove from $X_C$ every tuple $(x_1, \ldots, x_d)$ which explicitly falsifies a clause of $\phi$.
Since $\phi$ is anti-monotone, this is simple, e.g., if all clauses of $\phi$ are d-ary, then $(x_1, \ldots, x_d)$ is
removed from $X_C$ if and only if $\phi$ contains a clause $(\neg x_1 \vee \ldots \vee \neg x_d)$. Let the remaining set
of tuples be $X'_C$. Now, for every pair of color-tuples $C$ and $C'$ which have at least one color
in common, enforce consistency by excluding pairs of variable-tuples which disagree on some
common color, using a conjunction of clauses with two negative literals each. Let the resulting
instance be $\phi'$. It is clear that a multicolored satisfying assignment to $\phi'$ corresponds to a
multicolored satisfying assignment for $\phi$. Furthermore, since $\phi$ is anti-monotone, every clause
has been "verified" in the construction of some set $X_C$. Thus $\phi'$ has a multicolored satisfying
assignment of weight $k'$ iff $\phi$ has a multicolored satisfying assignment of weight k.

Multicolored $\Gamma^-_{1,2}$-WSAT($k \log n$) $\le_{ppt}$ $\Gamma^-_{1,2}$-WSAT($k \log n$): Given a multicolored
formula $\phi$, we transform $\phi$ into an "uncolored" formula $\phi'$ by adding a conjunction of negative
2-clauses over every color class of $\phi$. These ensure that exactly one variable of each class is set
to true in any satisfying assignment of $\phi'$ of weight k. Thus, $\phi'$ has a weight k satisfying assignment iff $\phi$
has a multicolored weight k satisfying assignment. □



The following follows from Flum and Grohe [21]. It can be readily verified that every reduc- 
tion used to prove [21, Lemma 7.5] is a PPT. 

Lemma 3 ([21, Lemma 7.5]). Let $d \ge 1$. Then:

- $\Gamma_{t,d}$-WSAT($k \log n$) $\equiv_{ppt}$ $\Gamma^-_{t,1}$-WSAT($k \log n$) for all odd t > 1.

- $\Gamma_{t,d}$-WSAT($k \log n$) $\equiv_{ppt}$ $\Gamma^+_{t,1}$-WSAT($k \log n$) for all even t > 1.

Next, we give the reductions for the MK-hierarchy.

Lemma 4. Let $d \ge 3$. Then $\Gamma_{1,d}$-SAT($n$) $\le_{ppt}$ $\Gamma_{1,3}$-SAT($n$).

Proof. Let $\phi$ be a $\Gamma_{1,d}$ formula, i.e. a d-CNF formula. By removing duplicate clauses, the total
length of $\phi$ is at most $O(n^d)$. Since d is a constant, the classical reduction to 3-CNF of Karp [28]
(via clause splitting) is a PPT. □
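
The clause-splitting step can be sketched as follows. The DIMACS-style encoding (a literal is a nonzero integer, negative for a negated variable) and the function names are assumptions made for this illustration; the brute-force checker is only there to test small examples.

```python
from itertools import product

def split_to_3cnf(clauses, n):
    """Clause splitting over variables 1..n.
    Returns (3-CNF clauses, total number of variables after splitting)."""
    out, next_var = [], n
    for clause in clauses:
        clause = list(clause)
        while len(clause) > 3:
            next_var += 1                        # fresh linking variable
            out.append(clause[:2] + [next_var])  # (l1 v l2 v z)
            clause = [-next_var] + clause[2:]    # (-z v l3 v ... v lm)
        out.append(clause)
    return out, next_var

def satisfiable(clauses, n):
    """Brute-force satisfiability check, for tiny examples only."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

A clause of length m contributes m - 3 fresh variables, so after removing duplicate clauses the new variable count stays polynomial in n for constant d.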

Lemma 5. Let $t \ge 2$ and $d \ge 1$. Then $\Gamma_{t,d}$-SAT($n$) $\le_{ppt}$ $\Gamma_{t,1}$-SAT($n$).

Proof. Let $\phi$ be a $\Gamma_{t,d}$-formula. Introduce a new variable for every bottom-level disjunction or
conjunction in $\phi$ of arity at most d. It is simple to enforce the values of the newly introduced
variables using a $\Gamma_{2,1}$-formula. Let $\phi'$ be such a formula, and let $\phi''$ be $\phi$ with every bounded-arity
bottom-level conjunction or disjunction replaced by the corresponding new variable. Then $\phi \equiv
\phi' \wedge \phi''$ and the reduction creates at most $O(n^d)$ variables. □

This finishes the set of reductions needed to prove Theorem 1. We now proceed to show
the class containments in our hierarchy. Let us denote by NP the class of those parameterized
problems whose unparameterized variants are in NP. That is, for a parameterized problem
$L \subseteq \Sigma^* \times \mathbb{N}$, we let $\tilde{L} \subseteq (\Sigma \cup \{\#\})^*$ denote the unparameterized variant of L defined by
$\tilde{L} := \{x\#1^k : (x, k) \in L\}$, where # is some symbol not in $\Sigma$. Then, as a class of parameterized
problems, NP $= \{L \subseteq \Sigma^* \times \mathbb{N} : \tilde{L} \in \mathrm{NP}\}$.
We have the following relationship between PK and MK[1]:

Lemma 6. PK $\cap$ NP = MK[1].

Proof. Let L be a parameterized problem in PK $\cap$ NP. Then L admits a polynomial kernel $\mathcal{K}$,
and since $L \in$ NP, there is a reduction $\mathcal{R}$ from $\tilde{L}$ to the unparameterized version of $\Gamma_{1,3}$-SAT($n$).
Composing $\mathcal{K}$ and $\mathcal{R}$ (by appropriately switching back and forth from instances of L to instances
of $\tilde{L}$) gives a polynomial parametric transformation from L to $\Gamma_{1,3}$-SAT($n$). Thus, $L \in$ MK[1],
and so PK $\cap$ NP $\subseteq$ MK[1]. Conversely, an instance of $\Gamma_{1,d}$-SAT($n$) has $O(n^d)$ distinct clauses,
and so by removing duplicate clauses we get a polynomial kernel for $\Gamma_{1,d}$-SAT($n$). Thus, $\Gamma_{1,d}$-SAT($n$)
$\in$ PK for every $d \ge 1$, and by Lemma 1 we have MK[1] $\subseteq$ PK. Furthermore, as MK[1] is defined
as the PPT-closure of problems whose unparameterized problems are in NP, and since PPTs
are a restricted form of classical polynomial-time reductions, we have MK[1] $\subseteq$ NP. □

Lemma 7. MK[t] $\subseteq$ WK[t] for any integer $t \ge 1$.

Proof. Let $t \ge 1$. By Lemma 1 and Definition 4, to prove the lemma it suffices to show that
$\Gamma_{t,d}$-SAT($n$) $\le_{ppt}$ $\Gamma_{t,d+1}$-WSAT($k \log n$) for every $d \in \mathbb{N}$. So fix d, and let $\alpha$ be a $\Gamma_{t,d}$ formula
on n variables given as an input to $\Gamma_{t,d}$-SAT($n$). Let $x_1, \ldots, x_n$ denote the variables of $\alpha$. We
construct a formula over the variables $x_1^0, x_1^1, \ldots, x_n^0, x_n^1$. Let $\alpha'$ denote the formula obtained
after replacing in $\alpha$ each positive literal $x_i$ with $x_i^1$, and each negative literal $\neg x_i$
with $x_i^0$. Then $\alpha' \in \Gamma_{t,d}$. For each $i \in [n]$, define $\beta_i$ to be the formula $\beta_i := (x_i^0 \vee x_i^1) \wedge (\neg x_i^0 \vee \neg x_i^1)$,
i.e., $\beta_i \equiv (x_i^0 \neq x_i^1)$, and consider the formula $\alpha' \wedge \bigwedge_{i \in [n]} \beta_i$. Then, as $\alpha' \in \Gamma_{t,d}$ and $\beta_i \in \Gamma_{1,2}$ for
all $i \in [n]$, this formula can be written as a $\Gamma_{t,d'}$ formula $\beta$, where $d' = \max\{d, 2\} \le d + 1$.
Moreover, $\beta$ has a satisfying assignment of weight $k := n$ iff $\alpha$ is satisfiable. The lemma
follows. □
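
For the bottom level (t = 1, plain CNF) the construction in this proof can be sketched as below. The integer encoding of the two copies of each variable, the DIMACS-style literals, and all function names are assumptions of the sketch, and the two brute-force checkers exist only to compare both sides on small inputs.

```python
from itertools import combinations, product

def pos(i):
    """Index of the copy that stands for 'x_i is true'."""
    return 2 * i - 1

def neg(i):
    """Index of the copy that stands for 'x_i is false'."""
    return 2 * i

def sat_to_wsat(clauses, n):
    """CNF over variables 1..n -> (clauses, number of variables, target weight)."""
    out = [[pos(l) if l > 0 else neg(-l) for l in c] for c in clauses]
    for i in range(1, n + 1):
        out.append([pos(i), neg(i)])      # at least one copy is true
        out.append([-pos(i), -neg(i)])    # at most one copy is true
    return out, 2 * n, n

def holds(clauses, true_vars):
    return all(any((abs(l) in true_vars) == (l > 0) for l in c) for c in clauses)

def satisfiable(clauses, n):
    return any(
        holds(clauses, {i + 1 for i in range(n) if bits[i]})
        for bits in product([0, 1], repeat=n)
    )

def has_weight(clauses, n_vars, k):
    return any(
        holds(clauses, set(s)) for s in combinations(range(1, n_vars + 1), k)
    )
```

The target weight equals n and the new formula has 2n variables, so the new parameter is $n \log(2n)$, which is polynomial in the old parameter n.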

Lemma 8. WK[t] $\subseteq$ MK[t + 1] for any integer $t \ge 1$.

Proof. The lemma can be shown using the "$k \log n$ trick" introduced by Abrahamson et al. [1].
Introduce k groups of log n variables, each group determining via binary expansion the identity
of one selected variable. Since MK[t+1] may use formulas of depth one larger than WK[t], it
is possible to essentially replace the literals in an input formula by expressions over the $k \log n$
new variables. See [21, Theorem 16.42] for a detailed construction. □
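
The encoding idea behind this trick (not the formula rewriting itself, for which the cited construction should be consulted) can be illustrated by a short sketch. The function names and the little-endian bit order are arbitrary choices made here.

```python
from math import ceil, log2

def encode_selection(selected, n):
    """Write each selected index (0..n-1) as a group of ceil(log2 n) bits."""
    width = max(1, ceil(log2(n)))
    return [(v >> b) & 1 for v in selected for b in range(width)]

def decode_selection(bits, n, k):
    """Read k groups of bits back into k indices.
    Note: when n is not a power of two, some bit patterns decode to an
    index >= n; an actual reduction has to rule those out."""
    width = max(1, ceil(log2(n)))
    assert len(bits) == k * width
    return [
        sum(bits[g * width + b] << b for b in range(width))
        for g in range(k)
    ]
```

An assignment to the roughly $k \log n$ new variables thus names k selected variables of the original formula, which is why the new parameter matches the old one up to rounding.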

Lemma 9. MK[t], WK[t] $\subseteq$ EXPT for all $t \ge 1$.

Proof. Note that membership in EXPT is preserved by a PPT. Thus by Lemma 8 it suffices
to show that $\Gamma_{t,d}$-SAT($n$) $\in$ EXPT for every t and d, which is trivial by brute force.

Combining Lemma 7, Lemma 8, and Lemma 9 we get the following tower of inclusions of 
hierarchies, refining the class EXPT. 

Theorem 2. (PK $\cap$ NP) = MK[1] $\subseteq$ WK[1] $\subseteq$ MK[2] $\subseteq$ WK[2] $\subseteq$ MK[3] $\subseteq \cdots \subseteq$ EXPT.

As we will see in the next section, we know that WK[1] $\not\subseteq$ PK unless the polynomial
hierarchy collapses, since several complete problems for WK[1] were previously shown not
to have polynomial kernels under this assumption. We conjecture that none of these have
polynomial Turing kernels either. Thus, the main working hypothesis which this paper revolves
around is the following:

Conjecture: WK[1] $\not\subseteq$ Turing-PK.

4 Complete Problems for WK[1] 

In the following section we exemplify the robustness of WK[1], the fundamental hardness class of 
our hierarchy. We do this by showing that several natural problems are complete for this class. In 
doing so, we also establish that these problems are unlikely to admit polynomial Turing kernels. 
We begin in Section 4.1, by showing WK[1]-completeness for a handful of fundamental problems
which will be used later in the reductions for other problems. Following this, in Section 4.2 we 
study two important satisfiability problems. Then, in Section 4.3, we review some classical 
path and cycle problems which were shown in [10] not to admit polynomial kernels unless PH 
collapses. Following this, in Section 4.4, we examine problems whose kernelizability status has
been dealt with in [17]. For the sake of brevity, we defer all standard problem definitions to 
Appendix A. 

4.1 Basic Problems 

In this section we prove the following theorem, which covers some basic problems which will be 
convenient for showing WK[1]-hardness and completeness for other problems.

Theorem 3. The following problems are all complete for WK[1]:
- Clique(k log n) and Independent Set(k log n).

- Binary NDTM Halting(k) and NDTM Halting(k log n).

- Hitting Set(m) and Exact Hitting Set(m).

- Set Cover(n) and Exact Set Cover(n).

Note that the result for Clique(k log n) and Independent Set(k log n) follows directly
from the fact that both problems are known to be PPT-equivalent to $\Gamma^-_{1,2}$-WSAT($k \log n$) (see
e.g. [21]), and from the fact that $\Gamma^-_{1,2}$-WSAT($k \log n$) is WK[1]-complete (Lemma 3). We will
also need a few annotated problems to be used in reductions. The following lemma may be
considered folklore, although an explicit proof of its first part can be found in [19], and of the
second part in [17]^.

Lemma 10. The following equivalences hold. 

1. Multicolored Clique(k log n) $\equiv_{ppt}$ Clique(k log n).

2. Multicolored Hitting Set(m) $\equiv_{ppt}$ Hitting Set(m).

We now proceed with the reductions. For many problems, we will find it most convenient to
show hardness by reduction from Exact Hitting Set(m) or Hitting Set(m), and membership
by reduction to a Turing machine problem; hence we begin by showing the completeness of
these problems. We give a chain of reductions from Multicolored Clique(k log n) to Exact
Hitting Set(m) to NDTM Halting(k log n) to Binary NDTM Halting(k), for which we
finally show WK[1]-membership directly, closing the cycle; after this we will treat the Hitting
Set(m) problem.

Lemma 11. Multicolored Clique(k log n) $\le_{ppt}$ Exact Hitting Set(m).

Proof. Let G be a graph on n vertices in an instance of Multicolored Clique(k log n), and
let $c : V(G) \to [k]$ be the coloring function of G. We assume that $V(G) = [n]$, and we let $b_\ell(v)$
denote the $\ell$'th bit in the binary expansion of $v \in V(G)$. The instance $(U, \mathcal{F})$ of Exact Hitting
Set(m) is constructed by taking $U = E(G)$ and defining $\mathcal{F}$ to be the following subsets of U:

- $F_{i,j} := \{uv \in E(G) : c(u) = i, c(v) = j\}$ for all $1 \le i < j \le k$.

- $F_{i,j,j',\ell} := \{uv \in E(G) : c(u) = i, c(v) = j, b_\ell(u) = 1\} \cup \{uv \in E(G) : c(u) = i, c(v) =
j', b_\ell(u) = 0\}$ for every pair of color pairs $(i, j)$ and $(i, j')$ with $j \neq j'$, and all $1 \le \ell \le \log n$.

Observe that any exact hitting set $U^* \subseteq U$ for $(U, \mathcal{F})$ consists of $\binom{k}{2}$ edges, one for each
pair of distinct colors $i, j \in [k]$. The sets $F_{i,j,j',\ell}$, $1 \le \ell \le \log n$, ensure that any pair of
edges $e \neq e' \in U^*$ with one endpoint colored i are incident with the same vertex $v \in V(G)$
with $c(v) = i$. Otherwise, if e is incident with an i-colored vertex v, and $e'$ is incident with
an i-colored vertex $v' \neq v$, then $b_\ell(v) \neq b_\ell(v')$ for some $\ell \in \{1, \ldots, \log n\}$, and e and $e'$ are
both in some $F_{i,j,j',\ell}$ for this specific $\ell$. Thus, any exact hitting set of $(U, \mathcal{F})$ corresponds to a
multicolored clique in G. Conversely, the edge-set of any multicolored clique in G is an exact
hitting set of $(U, \mathcal{F})$. There are $\binom{k}{2}$ sets $F_{i,j}$ and at most $k^3 \log n$ sets $F_{i,j,j',\ell}$, so $m := |\mathcal{F}|$ is
polynomial in $k \log n$. Hence our construction is a polynomial parametric
transformation, and so Multicolored Clique(k log n) $\le_{ppt}$ Exact Hitting Set(m). □
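
The construction of this proof can be exercised on small graphs with the following sketch. It uses 0-based vertices and colors, takes ordered pairs $(j, j')$, and represents an edge as a tuple; these conventions, the function names, and the brute-force checkers are assumptions of the sketch rather than details fixed by the proof.

```python
from itertools import combinations
from math import ceil, log2

def clique_to_exact_hitting_set(n, edges, color, k):
    """Vertices 0..n-1, colors 0..k-1. Returns (universe, family of sets)."""
    bits = max(1, ceil(log2(n)))

    def between(i, j):
        """Pairs (u, e): edge e joins an i-colored vertex u to a j-colored one."""
        res = []
        for e in edges:
            a, b = e
            if color[a] == i and color[b] == j:
                res.append((a, e))
            elif color[b] == i and color[a] == j:
                res.append((b, e))
        return res

    family = []
    for i, j in combinations(range(k), 2):           # the sets F_{i,j}
        family.append(frozenset(e for _, e in between(i, j)))
    for i in range(k):                               # the sets F_{i,j,j',l}
        others = [c for c in range(k) if c != i]
        for j in others:
            for j2 in others:
                if j == j2:
                    continue
                for l in range(bits):
                    ones = {e for u, e in between(i, j) if (u >> l) & 1}
                    zeros = {e for u, e in between(i, j2) if not (u >> l) & 1}
                    family.append(frozenset(ones | zeros))
    return list(edges), family

def has_exact_hitting_set(universe, family):
    """Brute force over all subsets of the universe (tiny inputs only)."""
    for r in range(len(universe) + 1):
        for s in combinations(universe, r):
            if all(len(set(s) & f) == 1 for f in family):
                return True
    return False

def has_multicolored_clique(n, edges, color, k):
    edge_set = {frozenset(e) for e in edges}
    for cand in combinations(range(n), k):
        if sorted(color[v] for v in cand) == list(range(k)) and all(
            frozenset(p) in edge_set for p in combinations(cand, 2)
        ):
            return True
    return False
```

A set $F_{i,j,j',\ell}$ is hit exactly once by the two chosen edges precisely when their i-colored endpoints agree on bit $\ell$, which is what forces a common endpoint.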

Lemma 12. Exact Hitting Set(m) $\le_{ppt}$ NDTM Halting(k log n).

^ Dom et al. [17] actually use the name Red-Blue Dominating Set instead of Hitting Set, but the two
problems are essentially the same.
Proof. Let $(U, \mathcal{F})$ be an instance of Exact Hitting Set(m), with $U = [n]$ and $\mathcal{F} =
\{F_1, \ldots, F_m\}$. By identifying vertices which are included in the same set of edges, we may
assume that $n \le 2^m$. We create a Turing machine M with alphabet [n], which writes down m
values specifying the selected member of each set, followed by simple poly(m)-time verification.
Specifically, let the m selections be $u_1, \ldots, u_m$. Then the machine has to verify that $u_i \in F_i$ for
each $i \in [m]$, and that for every pair $i, j \in [m]$, either $u_i = u_j$ or $u_i \in (F_i \setminus F_j)$ and $u_j \in (F_j \setminus F_i)$.
By encoding this information in the state space of M, we get a machine of size $n'$ which is
polynomial in $n + m$, that can nondeterministically verify in $k = O(m^2)$ steps whether $(U, \mathcal{F})$ has
an exact hitting set. Since $\log n' = O(m)$, this is indeed a PPT. □
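
The verification condition that the machine checks after guessing $u_1, \ldots, u_m$ can be written out as a plain function. This is only a sketch of the check itself (with hypothetical names), not of the Turing machine encoding.

```python
def is_exact_hitting_selection(sets, selection):
    """sets[i] is F_i; selection[i] is the element u_i guessed for F_i.
    Returns True iff the guessed elements form an exact hitting set."""
    m = len(sets)
    for i in range(m):
        if selection[i] not in sets[i]:      # u_i must lie in F_i
            return False
    for i in range(m):
        for j in range(i + 1, m):
            ui, uj = selection[i], selection[j]
            if ui == uj:
                continue
            # distinct picks: u_i in F_i \ F_j and u_j in F_j \ F_i
            if ui in sets[j] or uj in sets[i]:
                return False
    return True
```

The check uses m membership tests plus one test per pair of sets, in line with the quadratic step bound stated in the proof.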

Lemma 13. NDTM Halting(k log n) $\le_{ppt}$ Binary NDTM Halting(k), and both are
members of WK[1].

Proof. The reduction from NDTM Halting(k log n) to Binary NDTM Halting(k) is
well-known: The cells of the tape of the former machine M, with entries from $\Sigma$, can be subdivided
into $\log |\Sigma| \le \log |M|$ binary cells, and every transition of the machine subdivided into $O(\log |\Sigma|)$
steps corresponding to reading or writing the value of the original cell. Thus, it remains to show
membership for Binary NDTM Halting(k).

This we do by a reduction to $\Gamma_{1,3}$-WSAT($k \log n$). Let $(M, k)$ be the input to Binary NDTM
Halting(k), where M is a Turing machine of size n with s states and $\ell$ edges in its transition
diagram. We construct a $\Gamma_{1,3}$ (3-CNF) formula $\phi$, such that $\phi$ has a satisfying assignment of
Hamming weight $k'$, value to be chosen later, iff M accepts the empty string in k steps. For this
we create variable groups as follows:

- Variables S_{i,t}, i ∈ [s] and t ∈ {0} ∪ [k], signifying that the machine is in state number i after
t time steps.

- Variables M_{e,t}, e ∈ [ℓ] and t ∈ [k], signifying that edge e of the machine's state diagram is
followed as step t.

- Variables H_{p,t}, 0 ≤ p, t ≤ k, signifying that the machine's tape head is in position p after t
time steps.

- Variables T_{p,t}, 0 ≤ p, t ≤ k, signifying that position p of the tape contains value 1 after t
time steps.

- Variables T̄_{p,t}, 0 ≤ p, t ≤ k. These are constrained so that T_{p,t} ≠ T̄_{p,t} for all values of p and
t, allowing us to control the weight of any satisfying assignment for φ.

We add the following clauses to φ. We first add clauses of size 2 to ensure that T_{p,t} ≠ T̄_{p,t}
for every value of p and t. Then we enforce that at most one variable among S_{·,t}, H_{·,t}, and M_{·,t}
is true for each t, by creating a conjunction of clauses of size 2 over the corresponding variable
sets. We additionally enforce, by negative clauses of size 1, that the final state of the machine
is an accepting state. We then add clauses enforcing consistency of states, head, tape, and
transition variables. For example, if edge e moves from state i to state j, reading a 0, writing a
1, and moving right, then we have constraints: (M_{e,t} ∧ H_{p,t-1} → ¬T_{p,t-1}), (M_{e,t} ∧ H_{p,t-1} → T_{p,t}),
(M_{e,t} ∧ H_{p,t-1} → H_{p+1,t}), (M_{e,t} → S_{i,t-1}), and (M_{e,t} → S_{j,t}), for all possible values of p and t.
We complete the construction of φ by adding clauses encoding (¬H_{p,t} → (T_{p,t} = T_{p,t+1})) for all
p and t, and clauses controlling the initial setup of the machine (e.g. head position 0, state 1,
all tape entries 0).

As a final step of our reduction we set k' = 2(k + 1) + (k + 1)^2 + k. This accounts for the fact
that, at each time step, exactly one variable among each of the S-, H-, and M-variables is set to
true, and that the total Hamming weight of the T- and T̄-variables is (k + 1)^2, which is the total
number of bits needed to encode the information on the tape of M throughout k computation steps.
The formula φ constructed above is satisfiable by an assignment of Hamming weight k' if and only if
M accepts the empty string in k steps. As Binary NDTM Halting(k) can be solved in 2^{O(k)} · n^{O(1)}
time, we may assume that log n ≤ k, since otherwise the reduction is trivial. Thus, letting n' denote
the number of variables of φ, we have k' log n' = k^{O(1)} as needed, and the above construction is
indeed a PPT. □
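
The weight accounting for k' can be spelled out as a small calculation. The Python sketch below is
illustrative only; the function names and the dictionary keys are our own labels for the five
variable groups of the proof.

```python
def variable_counts(k, s, ell):
    """Number of variables per group for a machine with s states and ell
    transition edges that runs for k steps."""
    return {
        "S": s * (k + 1),        # S_{i,t}: i in [s], t in {0, ..., k}
        "M": ell * k,            # M_{e,t}: e in [ell], t in [k]
        "H": (k + 1) ** 2,       # H_{p,t}: 0 <= p, t <= k
        "T": (k + 1) ** 2,       # T_{p,t}: 0 <= p, t <= k
        "T_bar": (k + 1) ** 2,   # complements of the T-variables
    }

def target_weight(k):
    """Hamming weight k' of an assignment that encodes one k-step run."""
    states = k + 1           # one true S-variable for each t in {0, ..., k}
    heads = k + 1            # one true H-variable for each t in {0, ..., k}
    edges = k                # one true M-variable for each t in [k]
    tape = (k + 1) ** 2      # exactly one of T_{p,t} and its complement is true
    return states + heads + edges + tape
```

For example, target_weight(3) evaluates 4 + 4 + 3 + 16, in agreement with 2(k + 1) + (k + 1)^2 + k.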

The remaining problems in Theorem 3 for which we need to show completeness are Hitting
Set(m), Set Cover(n), and Exact Set Cover(n). Since it is well known that Hitting
Set(m) ≡ppt Set Cover(n) and Exact Hitting Set(m) ≡ppt Exact Set Cover(n), we
finish the proof of Theorem 3 by showing WK[1]-completeness for Hitting Set(m).

Lemma 14. Hitting Set(m) is WK[1]-complete.

Proof. Hitting Set(m) can be shown to be in WK[1] by an argument similar to the one used in
Lemma 12. To show WK[1]-hardness, we reduce from Exact Hitting Set(m). Let (U, ℱ) be an input
instance to Exact Hitting Set(m), with ℱ = {F_1, ..., F_m} and U := [n]. We can assume
log n ≤ m by identifying identical vertices. Let V = ∪_{i∈[m]} {v_{i,j} : j ∈ F_i}; this will be
the vertex set of our Hitting Set(m) instance. To ensure consistency between the selections
in different sets, we consider the binary expansions of the vertices in U. Let i, i' ∈ [m], i ≠ i',
and ℓ ∈ [log n]. Let A_{(i,i',ℓ)} consist of the vertices v_{i,j} for every j ∈ F_i ∩ F_{i'} such that the ℓ'th bit
of the binary expansion of j is 0, and let B_{(i,i',ℓ)} consist of the vertices v_{i',j} for every j ∈ F_i ∩ F_{i'}
such that the ℓ'th bit of the binary expansion of j is 1. Let C_{(i,i')} = {v_{i,j} : j ∈ F_i \ F_{i'}}. We add
the edge E_{(i,i',ℓ)} = A_{(i,i',ℓ)} ∪ B_{(i,i',ℓ)} ∪ C_{(i,i')} to our Hitting Set(m) instance for all i, i' ∈ [m],
i ≠ i', and ℓ ∈ [log n]. Finally, for every i ∈ [m], we add the edge E_i = {v_{i,j} : j ∈ F_i} to our
output instance.

Let ℰ denote the resulting edge set, and set k = m. If (U, ℱ) has an exact hitting set, then
this immediately gives a hitting set for (V, ℰ) of size m: For every pair of edges F_i and F_{i'},
either their intersection is hit, in which case exactly one of A_{(i,i',ℓ)} and B_{(i,i',ℓ)} is hit for every ℓ,
or the symmetric difference is hit in both sets, in which case C_{(i,i')} is hit. On the other hand,
assume that (V, ℰ) has a hitting set of size m; then the edges E_i ensure that this hitting set
corresponds to selecting one vertex from each edge F_i ∈ ℱ. We argue that these vertices must
form an exact hitting set for (U, ℱ). Consider thus a pair of sets F_i and F_{i'}. Clearly, one vertex
is selected in each edge. If both selected vertices are in F_i ∩ F_{i'}, but the vertices differ, then one
of the edges E_{(i,i',ℓ)} or E_{(i',i,ℓ)} is not hit, where ℓ is a bit distinguishing the binary expansions of
these two vertices. If the vertex selected from F_i lies in F_i ∩ F_{i'}, but the vertex selected from F_{i'}
lies in the symmetric difference of F_i and F_{i'}, let ℓ be a bit equalling 1 in the binary expansion
of the vertex selected from F_i (this exists by construction). Then the edge E_{(i,i',ℓ)} is not hit:
Clearly, B_{(i,i',ℓ)} and C_{(i,i')} are not hit, and by choice neither is A_{(i,i',ℓ)}. Thus, if the two selected
vertices are two distinct vertices, both these vertices must lie in the symmetric difference of
F_i and F_{i'}. This implies that for every pair of sets, the vertices selected from these sets are
either identical vertices in the intersection, or two distinct vertices in the symmetric difference.
It follows that the set of all selected vertices forms a valid exact hitting set for (U, ℱ). □
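
The reduction can be tried out on small instances. The Python sketch below is an illustration
under our own conventions: a vertex v_{i,j} is represented as the pair (i, j), elements are taken
from {1, ..., n} so that every binary expansion contains a 1, and both brute-force solvers exist
only to compare the two instances.

```python
from itertools import combinations

def reduce_to_hitting_set(sets, n):
    """Build the Hitting Set instance (V, edges, budget) from sets F_1..F_m over {1..n}."""
    m, bits = len(sets), max(1, n.bit_length())
    vertices = {(i, j) for i in range(m) for j in sets[i]}
    edges = [frozenset((i, j) for j in sets[i]) for i in range(m)]   # the edges E_i
    for i in range(m):
        for i2 in range(m):
            if i == i2:
                continue
            common = sets[i] & sets[i2]
            c = {(i, j) for j in sets[i] - sets[i2]}                 # C_(i,i')
            for l in range(bits):
                a = {(i, j) for j in common if not (j >> l) & 1}     # A_(i,i',l)
                b = {(i2, j) for j in common if (j >> l) & 1}        # B_(i,i',l)
                edges.append(frozenset(a | b | c))
    return vertices, edges, m

def has_exact_hitting_set(sets, n):
    """Brute force over all subsets of {1..n}."""
    universe = range(1, n + 1)
    return any(all(len(set(cand) & f) == 1 for f in sets)
               for r in range(n + 1) for cand in combinations(universe, r))

def has_hitting_set(vertices, edges, budget):
    """Brute force over all vertex subsets of size exactly budget."""
    return any(all(set(cand) & e for e in edges)
               for cand in combinations(sorted(vertices), budget))
```

On the instance {1,2}, {2,3} both problems have a solution, whereas on {1,2}, {2,3}, {1,3}
neither does, since every element lies in exactly two of the three sets.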

4.2 Satisfiability Problems 

We now turn our attention to two important satisfiability problems. The first is Min Ones
d-SAT(k), which is the problem of determining whether a Γ_{1,d}-formula (d-CNF) has a satisfying
assignment of weight at most k. This problem is similar to the class-defining Γ_{1,d}-WSAT(k log n),
except that as the problem is FPT (see [32]), we can simply parameterize by k.
As d is a constant, this can express in polynomial space any type of constraint involving at
most d variables. For example, this implies that the ℋ-Free Graph Editing(k) problem is in
WK[1] for every fixed set of forbidden induced subgraphs ℋ (see [12]). The second satisfiability
problem we deal with is Local Circuit SAT(k log n), which defines the class VC_1 of the
VC-hierarchy defined by Harnik and Naor [24].

Theorem 4. Min Ones d-SAT(k) is WK[1]-complete for every d ≥ 3.

Proof. Min Ones d-SAT(k) is easily seen to be in WK[1] by a reduction to Γ_{1,d}-WSAT(k log n).
Let φ be the input formula with n variables. We note that we may assume log n ≤ k, or else
solve the problem in time polynomial in n using the 2^{O(k)}-time FPT algorithm [32]. Thus Min
Ones d-SAT(k) reduces to Min Ones d-SAT(k log n) by PPT. Further, we can create a trivial
formula which is satisfiable at any weight from 0 to k, e.g. by adding k clauses, each containing
two negative literals of variables not occurring in the rest of the formula. This gives the PPT
to Γ_{1,d}-WSAT(k log n).

To prove WK[1]-hardness, we show a PPT from Exact Hitting Set(m), which is WK[1]-hard
by Theorem 3. Let (V, ℰ) be an instance of Exact Hitting Set with |V| = n and
|ℰ| = m. First, we show that using binary trees, we can build selection formulas of polynomial
size that force at least one member of each edge E ∈ ℰ to be selected to a solution, while
requiring only log |E| true variables. Recall that we can assume |E| ≤ n ≤ 2^m for each edge
E ∈ ℰ. We construct a Γ_{1,3} formula φ with n' = Σ_{E∈ℰ} |E| + n variables as follows. For each
edge E ∈ ℰ, our formula φ will contain |E| variables, in addition to global variables associated
with the vertex set V.

Let E ∈ ℰ and ℓ = ⌈log |E|⌉. Build first a binary tree with root x_0 and where node x_i
has children x_{2i+1} and x_{2i+2}, up to depth ⌈log |E|⌉. Create a clause (x_0), and for every non-leaf
node x_i a clause (x_i → (x_{2i+1} ∨ x_{2i+2})). It is not hard to see that this construction requires at least
one node on every level of the binary tree to be true, and that it can be satisfied by setting
the nodes along a path from the root to a leaf to be true. Next, for every leaf x' of the tree,
add a clause (x' → v) for some v ∈ E, forming a surjective mapping from the leaves to vertices
in E. This finishes the selection gadget for E, adding a cost of ℓ + 1 true variables among
the x_i. All clauses used in the gadget contain at most 3 literals. Finally, we add a conjunction of
clauses of size 2 with negative literals, to constrain the formula to contain exactly one true variable
corresponding to the elements of E, as required by the Exact Hitting Set problem. Create
selection and constraint gadgets for every edge in ℰ, and let k' = Σ_{E∈ℰ}(⌈log |E|⌉ + 1) + m. The
resulting formula is satisfiable if and only if (V, ℰ) has an exact hitting set, and k' is clearly an
upper bound on the possible number of true variables in a solution. □
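
The selection gadget for a single edge can be generated and checked exhaustively. In the Python
sketch below, the clause representation (lists of (variable, polarity) pairs), the variable names
('x', i) and ('v', u), and the rule mapping leaf number r to the r-th element modulo |E| are our
own illustrative choices.

```python
from itertools import product
from math import ceil, log2

def selection_gadget(edge):
    """Return (depth, clauses) of the tree gadget plus the at-most-one constraints."""
    elems = sorted(edge)
    depth = ceil(log2(len(elems))) if len(elems) > 1 else 0
    first_leaf = 2 ** depth - 1
    clauses = [[(("x", 0), True)]]                       # the root is true
    for i in range(first_leaf):                          # x_i -> (x_{2i+1} or x_{2i+2})
        clauses.append([(("x", i), False),
                        (("x", 2 * i + 1), True), (("x", 2 * i + 2), True)])
    for r in range(2 ** depth):                          # leaf -> vertex, surjective
        clauses.append([(("x", first_leaf + r), False),
                        (("v", elems[r % len(elems)]), True)])
    for a in range(len(elems)):                          # at most one vertex of E
        for b in range(a + 1, len(elems)):
            clauses.append([(("v", elems[a]), False), (("v", elems[b]), False)])
    return depth, clauses

def satisfying_assignments(clauses):
    """Enumerate (weight, assignment) for all satisfying assignments (brute force)."""
    variables = sorted({var for clause in clauses for var, _ in clause})
    found = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[v] == pol for v, pol in c) for c in clauses):
            found.append((sum(values), assignment))
    return found
```

For an edge with three elements the tree has depth 2, so the cheapest satisfying assignment uses
ℓ + 1 = 3 tree variables and one vertex variable.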

The input to the Local Circuit SAT(k log n) problem defined by Harnik and Naor [24]
is a string of length n and a circuit C with k + k log n inputs, of size k + k log n. An instance
is positive if there are k positions i_1, ..., i_k in the string such that feeding the contents of the
positions to the first k inputs, and the binary expansions of i_1, ..., i_k to the remaining inputs,
causes the circuit to accept. As Harnik and Naor note, we may equivalently assume the circuit
to have size polynomially bounded in k log n, rather than exactly k + k log n.

Theorem 5. Local Circuit SAT(k log n) is complete for WK[1].

Proof. To show WK[1]-hardness, we reduce from Clique(k log n), which is WK[1]-hard by
Theorem 3. Let an instance (G, k) be given. Assume that n is a power of 2, or else pad the instance
with isolated vertices (at most doubling the size). The input to the circuit is the adjacency matrix
written row by row as a string of length n' = n^2, modified to have a 1 in each diagonal entry
(for ease of presentation). The circuit gets input from k^2 positions, with each position coded
in log n' = 2 log n bits; here n being a power of 2 ensures that there is a trivial way of converting
between matrix positions and places in the string. The first part of the string thus provides the
circuit with k^2 entries of the adjacency matrix, and it simply checks that these are all ones. The
second part contains the positions, and the circuit checks that all the positions can be obtained by
taking all combinations of concatenating the numbers of two vertices (each of size log n). For this
we may fix any desired ordering of the positions, hence the checking comes down to hardwired
equalities in an appropriate way (e.g. we might decide that the first k positions correspond to the
first vertex, hence the first log n bit-positions of the first k positions should be bitwise the same).
It is easy to check the correspondence between a k-clique in G and a choice of variable assignments
such that the circuit accepts.

To show membership in WK[1], we reduce to Binary NDTM Halting(k), which is WK[1]-complete
according to Theorem 3. The number of steps will be polynomial in k log n. It is well
known that a Turing machine can be constructed to simulate a fixed circuit in a number of
steps that is polynomial in the circuit size. To simulate an instance of Local Circuit SAT we
additionally encode the input string of length n into the Turing machine. The machine guesses
the k positions in k log n steps, writes the corresponding contents of the string onto the tape,
followed by the positions, and then simulates the circuit. □

4.3 Path and Cycle Problems 

We next prove hardness and completeness results with respect to WK[1] for well studied path 
and cycle problems. In particular, we prove the following two theorems: 

Theorem 6. The Disjoint Paths(k) and Disjoint Cycles(k) problems are WK[1]-hard.

Theorem 7. The following problems are complete for WK[1]:

- Multicolored Path(k) and Directed Multicolored Path(k).

- Multicolored Cycle(k) and Directed Multicolored Cycle(k).

We note that the "uncolored" versions of both problems in Theorem 7 are in WK[1], but we 
do not know whether they are complete or not. Nevertheless, the above four problems are 
interesting in their own right, and algorithms for them are used as subroutines in the classical 
color-coding algorithms for their uncolored counterparts [4]. 

To prove both theorems above we go through an intermediate problem known as the Disjoint
Factors(k) problem, and mimic the construction used in [10] to show that this problem
is WK[1]-complete. The input to Disjoint Factors(k) is a string S of length n over the alphabet
[k]. The goal is to determine whether there exists a set of k disjoint substrings S_1, ..., S_k
of S, where S_i is of the form i···i (i.e. a factor) for each i ∈ [k]. Bodlaender et al. [10] show that
this problem is solvable in 2^{O(k)} · n time, and thus is in EXPT.

Lemma 15. The Disjoint Factors(k) problem is WK[1]-hard.

Proof. We construct a PPT from Multicolored Hitting Set(m), which is WK[1]-hard
according to Theorem 3. Given an instance (V, ℰ, c, k) of Multicolored Hitting Set(m), with
|V| = n, |ℰ| = m, and c : V → [k], we create a string S of size polynomial in n + m as an
instance of Disjoint Factors as follows: Our alphabet will consist of one symbol e for every
edge e ∈ ℰ, and of at most k · ⌈log n⌉ auxiliary symbols: For every color i ∈ [k], we have ⌈log |V_i|⌉
symbols a^i_1, ..., a^i_{⌈log |V_i|⌉}, where V_i ⊆ V is the subset of vertices with c(v) = i. Clearly we can
identify vertices which are included in the same set of edges, and therefore we can assume that
log n = O(m), which makes our reduction a PPT. Our string S will contain substrings
corresponding to vertices; to a vertex v ∈ V contained in the edges e_1, ..., e_r ∈ ℰ, we assign the
substring S(v) = e_1 e_1 ... e_r e_r. The symbols corresponding to edges will appear only inside such
substrings S(v). For every color i ∈ [k], we will build a "selection gadget" for choosing a vertex
of the given color analogous to the one described in [10, Lemma 2]. The gadget will ensure that
we are able to pick factors contained in S(v) if and only if v is the chosen vertex from its color
class.

Let us describe the selection gadget for color i ∈ [k] more precisely. We assume that the
number of vertices n_i of this color is a power of 2 (otherwise we can just add isolated vertices),
and we let V_i = {v_1, ..., v_{n_i}}. If we had only two vertices v_1 and v_2 to pick from, we could
implement the gadget in the following manner: S[1,2] = a^i_1 S(v_1) a^i_1 S(v_2) a^i_1, where a^i_1 is some
symbol which does not appear inside S(v_1) nor S(v_2). By choosing the factor for a^i_1, we prevent
selecting factors from either S(v_1) or S(v_2), which corresponds to selecting whether to hit all
sets which include v_1, or all sets which include v_2. We can apply this construction in a recursive
manner; if the substring S[1, 2^j] implements the selection of a vertex from {v_1, ..., v_{2^j}}, and if
S[2^j + 1, 2^{j+1}] implements the selection of a vertex from {v_{2^j+1}, ..., v_{2^{j+1}}}, then the substring
a^i_{j+1} S[1, 2^j] a^i_{j+1} S[2^j + 1, 2^{j+1}] a^i_{j+1} selects a single vertex from {v_1, ..., v_{2^{j+1}}} (note that we
need only one auxiliary symbol per level of the recursion; symbols a^i_1, ..., a^i_j appear inside both
S[1, 2^j] and S[2^j + 1, 2^{j+1}]). It is easy to check that the only way of selecting factors for every
edge symbol e is selecting in the described gadgets substrings S(v) corresponding to vertices in
a multicolored hitting set of size k for (V, ℰ, c). The lemma thus follows. □
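
The recursive gadget can be generated mechanically and compared with a brute-force search on
tiny instances. The Python sketch below is an illustration under our own assumptions: the gadgets
of the individual colors are simply concatenated, symbols are encoded as tuples, every color class
is given as a list whose length is a power of 2, and the Disjoint Factors solver is a plain memoised
search rather than the algorithm of [10].

```python
from functools import lru_cache
from itertools import product

def vertex_block(v, edges):
    """S(v) = e_1 e_1 ... e_r e_r over the edges containing v."""
    return [("e", i) for i, e in enumerate(edges) if v in e for _ in range(2)]

def selection_gadget(color, vertices, edges):
    """Recursive gadget; one auxiliary symbol per recursion level and color."""
    if len(vertices) == 1:
        return vertex_block(vertices[0], edges)
    half = len(vertices) // 2
    a = ("a", color, len(vertices))
    return ([a] + selection_gadget(color, vertices[:half], edges) + [a]
            + selection_gadget(color, vertices[half:], edges) + [a])

def build_string(classes, edges):
    """Concatenate the gadgets of all colors; return the string and its alphabet."""
    s = [sym for color, vs in enumerate(classes)
         for sym in selection_gadget(color, vs, edges)]
    return s, set(s) | {("e", i) for i in range(len(edges))}

def has_disjoint_factors(s, alphabet):
    """Decide whether every symbol of the alphabet gets its own disjoint factor."""
    s = tuple(s)

    @lru_cache(maxsize=None)
    def solve(pos, remaining):
        if not remaining:
            return True
        if pos >= len(s):
            return False
        if solve(pos + 1, remaining):          # no factor starts at pos
            return True
        c = s[pos]
        if c not in remaining:
            return False
        rest = remaining - {c}                 # a factor for c starts at pos
        return any(s[e] == c and solve(e + 1, rest) for e in range(pos + 1, len(s)))

    return solve(0, frozenset(alphabet))

def has_multicolored_hitting_set(classes, edges):
    """Brute force: pick one vertex per color class and test all edges."""
    return any(all(set(pick) & e for e in edges) for pick in product(*classes))
```

With color classes {1, 2} and {3, 4}, the edges {1, 3}, {2, 3} admit a multicolored hitting set
(pick vertex 3), while the edges {1}, {2} do not, and the constructed strings behave accordingly.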

Bodlaender et al. [10] provide polynomial parametric transformations from Disjoint
Factors(k) to Disjoint Cycles(k) and Disjoint Paths(k). This, in combination with the
lemma above, provides the proof for Theorem 6. To prove Theorem 7, we show that Directed
Multicolored Path(k) and Multicolored Path(k) are WK[1]-complete. The corresponding
cycle problems in Theorem 7 follow immediately from this, and are thus omitted.

Lemma 16. Directed Multicolored Path(k) is WK[1]-complete.

Proof. It is easy to see that Directed Multicolored Path(k) ∈ WK[1], by reducing this
problem to NDTM Halting(k log n). Let G be a directed k-colored graph on n vertices given
as input to Directed Multicolored Path(k). First note that as Directed Multicolored
Path(k) can be solved in 2^{O(k)} · n^{O(1)} time [4], we can assume log n = O(k). We construct a
Turing machine M that encodes within its state-space the adjacency matrix of G. It is easy
to see that such a Turing machine can be programmed to determine non-deterministically in
O(k) steps whether G has a multicolored path on k vertices, by guessing k vertices of different
colors in G, and then checking whether these vertices form a path. Since |M| = O(n^2) and
log n = O(k), this gives the desired PPT.

To show hardness, we reduce from Disjoint Factors(k). Given an input string S to Disjoint
Factors(k) over the alphabet [k], we construct a directed k-colored graph G which has
a vertex corresponding to each factor of S. Each vertex is colored according to the starting (or
ending) letter of its corresponding factor, and there is an edge (u, v) in G if the factor
corresponding to u is strictly to the left of the factor corresponding to v in S. It is easy to verify
that G has a multicolored path of length k iff S has k disjoint factors. Thus, Disjoint Factors(k)
≤ppt Directed Multicolored Path(k), and the lemma is proven. □
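
The hardness reduction is short enough to write out in full. In the Python sketch below, a factor
is represented by its pair (start, end) of positions, a path "of length k" is read as a path on k
vertices, and the depth-first search is a brute-force check of our own, not an algorithm from the
paper.

```python
def factor_graph(s):
    """One vertex per factor of s, colored by its letter; arcs go strictly left to right."""
    verts = [(a, b) for a in range(len(s)) for b in range(a + 1, len(s)) if s[a] == s[b]]
    color = {v: s[v[0]] for v in verts}
    arcs = {(u, v) for u in verts for v in verts if u[1] < v[0]}
    return verts, color, arcs

def has_multicolored_path(verts, color, arcs, k):
    """Search for a directed path on k vertices whose colors are pairwise distinct."""
    def extend(last, used):
        if len(used) == k:
            return True
        return any((last, v) in arcs and color[v] not in used
                   and extend(v, used | {color[v]}) for v in verts)
    return any(extend(v, {color[v]}) for v in verts)
```

For S = 1122 the factors 11 and 22 are disjoint and joined by an arc, whereas for S = 1212 the
only two factors overlap, so the graph has no arcs.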

Lemma 17. Multicolored Path(k) is WK[1]-complete.

Proof. The argument showing that Multicolored Path(k) ∈ WK[1] is very similar to the
one used for Directed Multicolored Path(k). To show hardness, we provide a PPT from
Directed Multicolored Path(k). Let G be a directed k-colored graph on n vertices given
as input to Directed Multicolored Path(k). We construct a k'-colored graph G' on O(n)
vertices with k' = 3k + 4. First, we split each vertex v of color c in G into three vertices: v_in, v_mid,
and v_out, of colors c_in, c_mid and c_out respectively, forming a path v_in, v_mid, v_out in G'. A directed
edge (u, v) in G will be transformed to an edge {u_out, v_in} in G'. We add to G' four additional
vertices s_in, s_out, t_in and t_out, and assign to them four unique colors. The source vertex s_in will
be connected only to s_out, and analogously the sink vertex t_out will be connected only to t_in.
We connect s_out to every vertex v_in, v ∈ V(G), and we connect t_in to every v_out, v ∈ V(G).

Note that a multicolored path of length k' in G' must have s_in and t_out as its endpoints:
These vertices are the only representatives of their color classes, and therefore have to appear
on the path, and both have degree exactly one, which means that they can only be endpoints of
the path. Another property of G' is the fact that any multicolored path which visits one of the
vertices v_in, v_mid or v_out has to visit the other two vertices as well in a consecutive manner, which
can be proven by a straightforward case analysis. A path starting at s_in will therefore always
follow the directed edges of G in the "right direction": When arriving at v_in via some incoming
edge it will proceed to v_mid and v_out and then take some outgoing edge. This observation shows
that G has a multicolored path of length k iff G' has a multicolored path of length k'. Since our
construction is clearly a PPT, the lemma is proven. □
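
The vertex-splitting construction can be checked on small digraphs. The Python sketch below is
illustrative: vertices of G are 0, ..., n-1, the split vertices are encoded as pairs such as (v, "in"),
the four extra vertices are encoded as strings, and a path "of length k" is read as a path on k
vertices.

```python
def split_graph(n, color, arcs):
    """Build the undirected colored graph G' from a directed colored graph G."""
    col, edges = {}, set()

    def add(u, v):
        edges.add(frozenset((u, v)))

    for v in range(n):
        for part in ("in", "mid", "out"):
            col[(v, part)] = (color[v], part)           # colors c_in, c_mid, c_out
        add((v, "in"), (v, "mid"))
        add((v, "mid"), (v, "out"))
    for u, v in arcs:
        add((u, "out"), (v, "in"))                      # arc (u, v) becomes {u_out, v_in}
    for name in ("s_in", "s_out", "t_in", "t_out"):
        col[name] = name                                # four unique colors
    add("s_in", "s_out")
    add("t_out", "t_in")
    for v in range(n):
        add("s_out", (v, "in"))
        add("t_in", (v, "out"))
    return col, edges

def has_multicolored_path(col, adjacent, target):
    """Brute-force search for a path on `target` vertices with distinct colors."""
    def extend(last, used):
        if len(used) == target:
            return True
        return any(adjacent(last, v) and col[v] not in used
                   and extend(v, used | {col[v]}) for v in col)
    return any(extend(v, {col[v]}) for v in col)
```

The same search routine serves both graphs: for G the adjacency test looks up ordered pairs, and
for G' it looks up unordered pairs.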



4.4 Further Problems 

In this section we investigate WK[1]-completeness and hardness of various problems for which
kernelization lower bounds were obtained by Dom, Lokshtanov, and Saurabh [17].

Theorem 8. The following problems are complete for WK[1]:

- Connected Vertex Cover(k).

- Capacitated Vertex Cover(k).

- Steiner Tree(k + t).

- Small Subset Sum(k).

- Unique Coverage(k).

The first four problems in the theorem above all have polynomial parametric transformations
from Hitting Set(m) [17, Theorems 2 and 7], and are thus WK[1]-hard by Lemma 14.
To see that Unique Coverage(k) is also WK[1]-hard, consider the following easy PPT from
Exact Hitting Set(m). Let (V, ℰ) be the input instance with |ℰ| = m, and recall that we
can assume as usual that log n ≤ m. For each v ∈ V make a set F_v containing all sets E ∈ ℰ
with v ∈ E. It is easy to see that any set ℱ' ⊆ {F_v : v ∈ V} which uniquely covers k = m
elements of ℰ directly corresponds to an exact hitting set for (V, ℰ). Thus, all problems in
Theorem 8 are WK[1]-hard, and so to complete the proof of the theorem we show that all these
problems are members of WK[1].
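
The transformation to Unique Coverage can be illustrated as follows. In this Python sketch the
universe consists of the indices of the edges in ℰ, the family contains one set F_v per vertex, and
the exhaustive search is our own brute-force check.

```python
from itertools import combinations

def to_unique_coverage(vertices, edges):
    """Map each vertex v to F_v, the set of (indices of) edges that contain v."""
    return {v: frozenset(i for i, e in enumerate(edges) if v in e) for v in vertices}

def uniquely_covered(family, chosen):
    """Number of universe elements contained in exactly one chosen set."""
    counts = {}
    for v in chosen:
        for x in family[v]:
            counts[x] = counts.get(x, 0) + 1
    return sum(1 for c in counts.values() if c == 1)

def has_unique_cover(family, k):
    """Brute force: does some subfamily uniquely cover at least k elements?"""
    vs = sorted(family)
    return any(uniquely_covered(family, sub) >= k
               for r in range(len(vs) + 1) for sub in combinations(vs, r))
```

With edges {1,2}, {2,3} the single set F_2 uniquely covers both elements; with edges {1,2}, {2,3},
{1,3} no subfamily uniquely covers all three, matching the absence of an exact hitting set.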

For all five problems, membership in WK[1] is shown by PPTs to NDTM Halting(k log n).
In all cases the input will be included in the description of the Turing machine to allow for a
fast verification of guessed solutions. For Steiner Tree(k + t) and Small Subset Sum(k) the
PPTs are straightforward. The PPT for Unique Coverage(k) is also straightforward once we
realize that it is sufficient to consider a solution of size at most k, and that the O(4^k)-size kernel
of Moser et al. [33] lets us assume that the logarithm of the input size is polynomially bounded
in k. We thus state the following lemma without proof.

Lemma 18. Steiner Tree(k + t), Small Subset Sum(k), and Unique Coverage(k) all
have PPTs to the NDTM Halting(k log n) problem.

For the two remaining vertex cover variants in Theorem 8 the reductions are a bit more
subtle. Both of them utilize the so-called Buss rule used in the classical O(k^2) kernel for Vertex
Cover(k) [11]. This allows the output Turing machine in the reduction to quickly verify
solutions that it guesses. More details are given in the proofs of the two lemmas below.

Lemma 19. Connected Vertex Cover(k) ≤ppt NDTM Halting(k log n).

Proof. Given an instance (G, k) of Connected Vertex Cover(k), we first identify the set T
of all vertices of degree greater than k, and output a Turing machine that never halts if |T| > k,
since the vertices of T must be in every vertex cover of size k for G. The remaining budget k' =
k − |T| must be spent on a set N of vertices which covers all edges not incident on T and such
that G[T ∪ N] is connected. Since all vertices outside T have degree at most k, there can be
at most k^2 uncovered edges, or we may output a machine that never halts (as vertices in N
cover at most k|N| edges altogether). We construct a Turing machine M by encoding in its
state-space the set of at most k^2 uncovered edges, the set T, the graph G, and the budget k'.
It is now not difficult to see that by properly programming M, the machine will determine in
a number of non-deterministic steps polynomial in k whether G has a connected vertex cover of
size k. □
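
The preprocessing based on the Buss rule can be sketched as follows. The Python function below is
an illustration only; its name and return convention are ours, it uses the bound k·k' on the number
of uncovered edges stated in the parenthetical remark of the proof, and it covers only the
preprocessing, not the Turing machine.

```python
def buss_preprocess(n, edges, k):
    """Return (T, uncovered, budget), or None if no vertex cover of size k can exist.

    T holds the vertices of degree greater than k, `uncovered` the edges with
    no endpoint in T, and budget = k - |T| the number of further vertices."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    forced = {v for v in range(n) if degree[v] > k}
    if len(forced) > k:
        return None                        # more than k forced vertices
    budget = k - len(forced)
    uncovered = [(u, v) for u, v in edges if u not in forced and v not in forced]
    if len(uncovered) > k * budget:
        return None                        # budget vertices of degree <= k cannot cover them
    return forced, uncovered, budget
```

In the first case below, vertex 0 has degree 5 > k = 2 and is forced into the cover, leaving one
uncovered edge and a budget of one vertex.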

Lemma 20. Capacitated Vertex Cover(k) ≤ppt NDTM Halting(k log n).

Proof. Let (G, a, k) be an instance of Capacitated Vertex Cover(k), where G = (V, E) is
a graph on n vertices. We again identify the set T of vertices with degree exceeding k in G,
and output a non-halting machine in case |T| > k. We also output a non-halting machine if the
capacity of some vertex in T is lower than the number of its incident edges by more than k;
conversely we may delete a vertex of T if its capacity suffices for all incident edges (decreasing k
by one). By selecting a set N of k − |T| further vertices at most k^2 edges can be covered; this
includes edges not incident with T but also edges incident with some vertex of T for which the
budget does not suffice. The output Turing machine M is hardwired with an encoding of the set
T along with the input graph, including the degrees and capacities of all vertices, to essentially
allow random access to all these values. The machine M proceeds as follows:

1. Guess a set N of at most k' = k − |T| vertices. Verify that all edges have at least one endpoint
in T ∪ N (this only has to be done for the at most k^2 edges not incident to T).

2. For every edge with both endpoints in T ∪ N, guess the vertex which covers it. Register
on the tape for every vertex in T ∪ N the number of incident edges that the vertex does not
have to cover (due to it being covered at its other endpoint).

3. Verify for every vertex in T ∪ N that its capacity suffices to cover all remaining edges, that
is, that its capacity is at least its degree minus the number of edges covered by another
vertex.

It can be verified that the required number of steps can be bounded by a polynomial in k and
that the total machine size and construction time are polynomial in n. □

5 Problems in Higher Levels 

In this section we investigate the second level of the MK- and WK-hierarchies, and present some
complete and hard problems for these classes.

5.1 MK[2] 

According to Theorem 1, MK[2] is the PPT-closure of the classical CNF satisfiability problem
where the parameter is taken to be the number of variables in the input formula. The
PPT-equivalence of this problem to Hitting Set(n) and Set Cover(m) is well known.

Theorem 9. Hitting Set(n) and Set Cover(m) are complete for MK[2].

Heggernes et al. [25] consider the problems Restricted Perfect Deletion(|X|) and
Restricted Weakly Chordal Deletion(|X|), where the input is a graph G, a set X of ℓ
vertices of G such that G − X is perfect (respectively weakly chordal), and an integer k, and
the task is to select at most k vertices S ⊆ X such that G − S is perfect (respectively weakly
chordal). The following corollary is immediate from Theorem 9 and PPTs given in [25].

Corollary 1. Restricted Perfect Deletion(ℓ) and Restricted Weakly Chordal
Deletion(ℓ) are hard for MK[2].

5.2 WK[2] 

The following theorem establishes WK[2]-completeness for the following reparameterizations of
well-known W[2]-complete problems.

Theorem 10. The following problems are complete for WK[2]:

- Hitting Set(k log n) and Set Cover(k log m).

- Dominating Set(k log n) and Independent Dominating Set(k log n).

- Steiner Tree(k log n).

From Theorem 10, we immediately get the following corollary via PPTs by Lokshtanov [31] and
Heggernes et al. [25].

Corollary 2. The following problems are all hard for WK[2]:

- Wheel-free Deletion(k log n).

- Perfect Deletion(k log n).

- Weakly Chordal Deletion(k log n).

For the first four problems in Theorem 10, the results follow easily. The PPT-equivalence
between Γ_{2,1}-WSAT(k log n), Hitting Set(k log n), Set Cover(k log m), and Dominating
Set(k log n) is well known, and for Independent Dominating Set(k log n), a PPT to
Γ_{2,1}-WSAT(k log n) is trivial and a PPT from Dominating Set(k log n) can be produced by
standard methods.

The story is different with Steiner Tree(k log n). While WK[2]-hardness for this problem
follows immediately from e.g. the PPT from Hitting Set(k log n) given in [17], showing
membership in WK[2] is more challenging. To facilitate this and other non-trivial membership
proofs, we consider the issue of a machine characterization of WK[2], similarly to the
WK[1]-complete Binary NDTM Halting(k) problem. The natural candidate would be Multi-tape
NDTM Halting(k log n), as this same problem with parameter k is W[2]-complete [14]. However,
while the problem with parameter k log n is easily shown to be WK[2]-hard, we were so
far unable to show WK[2]-membership. On the other hand, we define the following extension
of a single-tape non-deterministic Turing machine problem which is WK[2]-complete. We name
the corresponding halting problem NDTM Halting with Flags.

Definition 5. A (single-tape, non-deterministic) Turing machine with flags is a standard
(single-tape, non-deterministic) Turing machine which in addition to its working tape has access
to a set F of flags. Each state transition of the Turing machine has the ability to read and/or
write a subset of the flags. A transition that reads a set S ⊆ F of flags is only applicable if all
flags in S are set. A transition that writes a set S ⊆ F of flags causes every flag in S to be set.
In the initial state, all flags are unset. Note that there is no operation to reset a flag.

Theorem 11. NDTM Halting with Flags(k log n) is WK[2]-complete.

Proof. Showing WK[2]-hardness is easy by reduction from Hitting Set(k log n). In fact, the
hitting set instance can be coded directly into the flags, without any motion of the tape head:
simply construct a machine that non-deterministically makes k non-writing transitions, each
corresponding to including a vertex in the hitting set, followed by one verification step. The
machine has m flags, one for every set in the instance, and a step corresponding to selecting a
vertex v activates all flags corresponding to sets containing v. Finally, the step to the accepting
state may only be taken if all flags are set. By assuming log m ≤ k log n (or else solving the
instance exactly) we get a PPT.

Showing membership in WK[2] can be done by translation to Γ_{2,1}-WSAT(k log n). The
translation is similar to that in Lemma 13. The only complication is to enforce consistency of
transitions which read and write sets of flags, but this is easily handled. Let M_{e,t} signify that step
number t of the machine follows edge e of the state diagram (as in Lemma 13). If transition e
has a flag f as a precondition, then we simply add a clause

(¬M_{e,t} ∨ M_{e_1,1} ∨ ... ∨ M_{e_r,1} ∨ ... ∨ M_{e_1,t-1} ∨ ... ∨ M_{e_r,t-1}),

where e_1, ..., e_r is an enumeration of all transitions in the state diagram which set flag f. The
rest of the reduction proceeds without difficulty. □
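
The hardness direction of the proof above can be simulated directly. The Python sketch below
explores every run of the machine (k selection steps followed by the verification step) and compares
the outcome with a brute-force hitting set search; the function names are ours, and the state
diagram and tape of the machine are not modelled.

```python
from itertools import combinations, product

def flags_machine_accepts(sets, universe, k):
    """Explore all runs: k flag-setting steps, then accept iff all m flags are set."""
    m = len(sets)
    for run in product(sorted(universe), repeat=k):
        flags = set()
        for v in run:
            # Selecting v sets the flag of every set containing v; flags are never reset.
            flags |= {i for i in range(m) if v in sets[i]}
        if len(flags) == m:
            return True
    return False

def has_hitting_set(sets, universe, k):
    """Brute force: is there a hitting set with at most k elements?"""
    return any(all(set(cand) & f for f in sets)
               for r in range(k + 1) for cand in combinations(sorted(universe), r))
```

Since a run may select the same vertex more than once, accepting runs correspond to hitting sets
of size at most k.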

Lemma 21. Steiner Tree(k log n) is WK[2]-complete.

Proof. As mentioned above, WK[2]-hardness for Steiner Tree(k log n) follows from the PPT
from Hitting Set(k log n) given in [17]. We show membership in WK[2] by a reduction to the
Turing machine problem with flags. Let (G, T, k) be an instance of Steiner Tree. We make
the following observations.

1. If two terminals t, t' ∈ T are neighbors in G, they may be identified.

2. Assume that no two terminals are neighbors in G. Let G' be G with N(t) replaced by a
clique for every t ∈ T. Then, for any solution S, the graph G'[S] must be connected.

In fact, a set S is a Steiner tree if and only if G'[S] is connected and every t ∈ T is neighbor to
a vertex in S. Thus, perform the reduction to G' as described. The Turing machine then goes
in two phases. First, it guesses the solution S consisting of k vertices, and checks (in poly(k)
time) the connectivity of G'[S]. Second, using the flags as in Theorem 11, it goes through the
vertices of S and verifies that every terminal is neighbor to at least one vertex. It accepts if
both tests pass. □
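
The characterization used in the proof above can be tested on a small graph. In the Python sketch
below, graphs are dictionaries mapping each vertex to its set of neighbors, S ranges over non-empty
sets of non-terminals, and no two terminals are adjacent; these conventions and all function names
are our own.

```python
from itertools import combinations

def connected(vertices, adj):
    """Is the subgraph induced by `vertices` connected? (The empty set counts as connected.)"""
    vertices = set(vertices)
    if not vertices:
        return True
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & vertices:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def clique_neighbourhoods(adj, terminals):
    """G': turn the neighborhood N(t) of every terminal t into a clique."""
    new = {u: set(nb) for u, nb in adj.items()}
    for t in terminals:
        for u, w in combinations(sorted(adj[t]), 2):
            new[u].add(w)
            new[w].add(u)
    return new

def connects_directly(adj, terminals, s):
    """Definition: S together with the terminals induces a connected subgraph of G."""
    return connected(set(s) | set(terminals), adj)

def connects_via_gprime(adj, adj_prime, terminals, s):
    """Test from the proof: G'[S] is connected and every terminal has a neighbor in S."""
    return connected(s, adj_prime) and all(adj[t] & set(s) for t in terminals)
```

The second test only inspects G'[S] and the terminal neighborhoods, which is what allows the
machine to verify a guessed solution in poly(k) steps plus one pass over the flags.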

6 Discussion 

We have defined a hierarchy of classes of inefficient kernelization, akin to the M- and W-hierarchy
of parameterized intractability. The fundamental distinction in the new hierarchy is between
problems admitting polynomial Turing kernels (Turing-PK) and WK[1]-hard problems. This
distinction does not seem to be addressable by previous lower-bound frameworks, as evidenced
by problems such as Leaf Out Branching(k) and Clique(Δ), which admit polynomial
Turing kernels but no standard polynomial kernels unless the polynomial hierarchy collapses.
Thus, WK[1] ⊈ Turing-PK is a new conjecture in kernelization theory. We showed that many
natural parameterized problems to which the kernelization lower bounds apply are WK[1]- 
complete, indicating that the class is natural. Of course, our examples provide only a partial 
image of the WK[1] landscape. For example, the various kernelizability dichotomies that have 
been shown for CSP problems [29, 30] can be shown to imply dichotomies between problems 
with polynomial kernels and WK[1]-complete problems (and in some cases the third class of
W[1]-hard problems).
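
To make the notion of a polynomial Turing kernel concrete, consider Clique parameterized by the maximum degree Δ. The Python sketch below relies on the standard observation (not spelled out in this section) that a clique lies inside the closed neighborhood of each of its vertices, so G has a k-clique if and only if one of the n induced subgraphs G[N[v]], each on at most Δ + 1 vertices, has one. The adjacency-dictionary representation, the function names, and the brute-force oracle standing in for the solver of small instances are all assumptions made for illustration.

```python
from itertools import combinations

def has_clique(vertices, adj, k):
    """Brute-force oracle: does the subgraph induced by `vertices` contain a k-clique?"""
    return any(all(v in adj[u] for u, v in combinations(cand, 2))
               for cand in combinations(sorted(vertices), k))

def clique_via_turing_kernel(adj, k):
    """Answer Clique by a disjunction over n instances of size at most Δ + 1."""
    for v in adj:
        closed_neighborhood = adj[v] | {v}   # at most Δ + 1 vertices
        if has_clique(closed_neighborhood, adj, k):
            return True
    return False

# Toy graph (assumed): a triangle 1-2-3 with a path 3-4-5 attached; Δ = 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(clique_via_turing_kernel(adj, 3), clique_via_turing_kernel(adj, 4))
```

Each oracle query has size bounded by the parameter alone, but the answer is a disjunction over n such queries rather than a single small instance, which is what separates a Turing kernel from a standard kernel.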

Still, several questions remain. One is the WK[1]-hardness of Path(k) and Cycle(k); for
these problems, we have only lower bound proofs in the framework of Bodlaender et al. [6],
leaving the question of Turing kernels open. There are also several problems, including those
in the work on structural graph parameters by Bodlaender, Jansen, and Kratsch [9, 26, 27], which
we have not investigated. It is also unknown whether Multi-Tape NDTM Halting(k log n) is
in WK[2]. Furthermore, it would be interesting to know some natural parameterized problems
which are WK[2]-complete under a standard parameter (e.g., k rather than k log n). On the
more structural side, we have noted several connections between our hierarchy and previous
hierarchies. It would be interesting if these could be made stronger. In particular, we would
like to know if there are (classical or parameterized) complexity theoretical implications of
polynomial Turing kernelizations for WK[1].

A Problem Zoo

Below we provide problem statements for all problems discussed in the paper. We adopt the
notation that appends brackets at the end of problem names to specify the parameterization
used for the specific problem. For instance, Connected Vertex Cover(k) denotes the
Connected Vertex Cover problem parameterized by the number k of vertices in the
solution.

Φ-SAT :

Input: A formula φ ∈ Φ with n variables, and an integer k.
Task: Decide whether φ is satisfiable.

Φ-WSAT :

Input: A formula φ ∈ Φ with n variables, and an integer k.

Task: Decide whether φ is satisfiable by an assignment of Hamming weight k (an assignment
that assigns exactly k variables the boolean value 1).
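
As an illustration of the weight constraint, the following Python sketch decides the weighted problem by brute force for the special case where Φ is the class of CNF formulas. The clause encoding (lists of non-zero integers, with a negative integer denoting a negated variable) and the function names are assumptions made for this example only.

```python
from itertools import combinations

def satisfies(clauses, true_vars):
    """True if setting exactly the variables in `true_vars` to 1 satisfies every clause."""
    return all(any((lit > 0) == (abs(lit) in true_vars) for lit in clause)
               for clause in clauses)

def weighted_sat(clauses, n, k):
    """Return a satisfying assignment of Hamming weight exactly k, or None."""
    for chosen in combinations(range(1, n + 1), k):
        if satisfies(clauses, set(chosen)):
            return set(chosen)
    return None

# (x1 or x2) and (not x1 or not x2) and x3
clauses = [[1, 2], [-1, -2], [3]]
print(weighted_sat(clauses, 3, 2))
print(weighted_sat(clauses, 3, 1))
```

The search inspects at most (n choose k) assignments, in contrast to the 2^n assignments relevant for the unweighted problem.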

Binary NDTM Halting : 

Input: A Turing machine M of size n with a binary alphabet, and an integer k. 
Task: Decide whether M halts on the empty string in k steps. 

Clique : 

Input: A graph G with n vertices, and an integer k. 

Task: Decide whether G has a clique of size k (a pairwise adjacent subset of k vertices). 

Capacitated Vertex Cover : 

Input: A graph G with n vertices, a capacity function a : V(G) → ℕ, and an integer k.
Task: Decide whether G has a capacitated vertex cover of size k (a subset S of k vertices that are
incident with each edge of G, and such that each vertex v ∈ S is incident with at most a(v) edges).

Connected Vertex Cover : 

Input: A graph G with n vertices, and an integer k. 

Task: Decide whether G has a connected vertex cover of size k (a connected subset of k vertices 
S that are incident with each edge of G). 

Directed Multicolored Cycle : 

Input: A directed graph G, a coloring function c : V → [k], and an integer k.

Task: Decide whether G has a multicolored directed cycle of length k (a directed cycle which
includes exactly one vertex from each color).

Directed Multicolored Path : 

Input: A directed graph G, a coloring function c : V → [k], and an integer k.

Task: Decide whether G has a multicolored directed path of length k (a directed path which
includes exactly one vertex from each color).

Disjoint Cycles :

Input: A graph G with n vertices, and an integer k. 
Task: Decide whether G contains k pairwise disjoint cycles. 

Disjoint Factors : 

Input: An n-character string S over the alphabet [k].

Task: Decide whether there exists a set of k non-overlapping substrings S_1, ..., S_k of S such
that S_i is of the form i ⋯ i for every alphabet symbol i ∈ [k].
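
The following Python sketch decides small instances by exhaustive search. It assumes that a substring of the form i ⋯ i has length at least two (it starts and ends with i at two different positions), and it represents S as a list of integers from [k]. The function name is hypothetical.

```python
def disjoint_factors(s, k):
    """Decide whether s contains pairwise non-overlapping factors, one per symbol 1..k."""
    n = len(s)

    def search(pos, remaining):
        # `remaining` holds the symbols that still need a factor to the right of pos.
        if not remaining:
            return True
        if pos >= n:
            return False
        # Option 1: no chosen factor starts at position pos.
        if search(pos + 1, remaining):
            return True
        # Option 2: a factor for symbol s[pos] starts here and ends at a later occurrence.
        sym = s[pos]
        if sym in remaining:
            for r in range(pos + 1, n):
                if s[r] == sym and search(r + 1, remaining - {sym}):
                    return True
        return False

    return search(0, frozenset(range(1, k + 1)))

print(disjoint_factors([1, 2, 1, 2, 2], 2))   # factors "1 2 1" and "2 2"
print(disjoint_factors([1, 2, 1, 2], 2))      # the only candidate factors overlap
```

The search takes exponential time in the worst case and is meant only to make the definition concrete.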

Disjoint Paths : 

Input: A graph G with n vertices, and k pairs of vertices (s_1, t_1), ..., (s_k, t_k).

Task: Decide whether G contains k pairwise disjoint paths connecting s_i to t_i for all i ∈ [k].

Dominating Set : 

Input: A graph G with n vertices, and an integer k. 

Task: Decide whether G has a dominating set of size k (a set D of k vertices for which every 
vertex not in D has a neighbor in D). 

Exact Hitting Set : 

Input: A hypergraph (V, ℰ) with |V| = n and |ℰ| = m.

Task: Decide whether (V, ℰ) has an exact hitting set (a subset S ⊆ V such that |S ∩ E| = 1
for all E ∈ ℰ).

Exact Set Cover : 

Input: A hypergraph (V, ℰ) with |V| = n and |ℰ| = m.

Task: Decide whether (V, ℰ) has an exact set cover (a subset S ⊆ ℰ of pairwise disjoint edges
with ⋃S = V).

Independent Set : 

Input: A graph G with n vertices, and an integer k. 

Task: Decide whether G has an independent set of size k (a pairwise non-adjacent subset of k 
vertices). 

Hitting Set : 

Input: A hypergraph (V, ℰ) with |V| = n and |ℰ| = m, and an integer k.

Task: Decide whether (V, ℰ) has a hitting set of size k (a subset S ⊆ V of size k with S ∩ E ≠ ∅
for all E ∈ ℰ).
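
The difference between the hitting set and exact hitting set conditions can be seen in a short Python sketch. The representation of the hypergraph (a set of vertices and a list of sets as edges) and the function names are assumptions made for illustration; the search is plain brute force.

```python
from itertools import combinations

def is_hitting_set(edges, s):
    """S ∩ E ≠ ∅ for every edge E."""
    return all(s & e for e in edges)

def is_exact_hitting_set(edges, s):
    """|S ∩ E| = 1 for every edge E."""
    return all(len(s & e) == 1 for e in edges)

def find_hitting_set(vertices, edges, k):
    """Return a hitting set of size exactly k, or None if none exists."""
    for cand in combinations(sorted(vertices), k):
        if is_hitting_set(edges, set(cand)):
            return set(cand)
    return None

V = {1, 2, 3, 4}
E = [{1, 2}, {2, 3}, {3, 4}]
print(find_hitting_set(V, E, 2))
print(is_exact_hitting_set(E, {2, 3}))
```

Here {2, 3} hits every edge but meets the edge {2, 3} twice, so it is a hitting set without being an exact one.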

Local Circuit SAT : 

Input: A circuit C over k + k log m variables and of size k + k log m, and a string S.
Task: Decide whether there is a list of k positions i_1, ..., i_k in S such that feeding the contents
of the positions to the first k inputs, and the binary expansions of i_1, ..., i_k to the remaining
inputs, causes C to accept.

Min Ones d-SAT :

Input: A formula φ ∈ Γ_{1,d} with n variables, and an integer k.

Task: Decide whether φ is satisfiable by an assignment of Hamming weight at most k.

Multicolored Φ-WSAT :

Input: A formula φ ∈ Φ over a variable set X of size n, a coloring function c : X → [k], and
an integer k.

Task: Decide whether φ is satisfiable by a multicolored assignment of Hamming weight k (an
assignment where no two variables of the same color are assigned a 1).

Multicolored Clique : 

Input: A graph G = (V, E) with |V| = n, a coloring function c : V → [k], and an integer k.
Task: Decide whether G has a multicolored clique of size k (a clique containing exactly one 
vertex of each color). 
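
A brute-force search makes the multicolored condition explicit: one vertex is picked from each color class, so at most (n/k)^k candidates are examined when the classes are balanced. The Python sketch below is illustrative only; the input representation (a vertex list, a list of edges, and a dictionary mapping vertices to colors in 1..k) and the function name are assumptions.

```python
from itertools import combinations, product

def multicolored_clique(vertices, edges, color, k):
    """Return a clique with exactly one vertex of each color 1..k, or None."""
    adjacent = {frozenset(e) for e in edges}
    classes = [[v for v in vertices if color[v] == c] for c in range(1, k + 1)]
    for pick in product(*classes):            # one vertex per color class
        if all(frozenset((u, v)) in adjacent for u, v in combinations(pick, 2)):
            return pick
    return None

vertices = ["a1", "a2", "b1", "b2", "c1"]
color = {"a1": 1, "a2": 1, "b1": 2, "b2": 2, "c1": 3}
edges = [("a1", "b1"), ("a2", "b2"), ("b2", "c1"), ("a2", "c1"), ("a1", "b2")]
print(multicolored_clique(vertices, edges, color, 3))
```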

Multicolored Cycle :

Input: A graph G, a coloring function c : V → [k], and an integer k.

Task: Decide whether G has a multicolored cycle of length k (a cycle which includes exactly 
one vertex from each color). 

Multicolored Hitting Set : 

Input: A hypergraph (V, ℰ) with |V| = n and |ℰ| = m, a coloring function c : V → [k], and an
integer k.

Task: Decide whether (V, ℰ) has a multicolored hitting set of size k (a hitting set which includes
exactly one vertex from each color).

Multicolored Path : 

Input: A graph G, a coloring function c : V → [k], and an integer k.

Task: Decide whether G has a multicolored path of length k (a path which includes exactly 
one vertex from each color). 

NDTM Halting : 

Input: A Turing machine M of size n, and an integer k. 
Task: Decide whether M halts on the empty string in k steps. 

Perfect Deletion : 

Input: A graph G on n vertices, and an integer k. 

Task: Decide whether G has a set S of at most k vertices such that G - S is perfect.

Restricted Perfect Deletion :

Input: A graph G on n vertices, a set X of ℓ vertices of G such that G - X is perfect, and an
integer k.

Task: Decide whether G has a set S ⊆ X of at most k vertices such that G - S is perfect.

Restricted Weakly Chordal Deletion :

Input: A graph G on n vertices, a set X of ℓ vertices of G such that G - X is weakly chordal,
and an integer k.

Task: Decide whether G has a set S ⊆ X of at most k vertices such that G - S is weakly chordal.

Set Cover : 

Input: A hypergraph (V, ℰ) with |V| = n, |ℰ| = m, and max_{E ∈ ℰ} |E| = d. Also, an integer k.
Task: Decide whether (V, ℰ) has a set cover of size k (a subset S ⊆ ℰ of k edges with ⋃S = V).

Small Subset Sum : 

Input: An integer k, a set S of integers of size at most 2^k, and an integer t.
Task: Decide whether there are at most k distinct integers in S that sum up to t. 

Steiner Tree : 

Input: A graph G = (V, E) with |V| = n, a set of t terminals T ⊆ V, a set of ℓ non-terminals
N ⊆ V, and an integer k.

Task: Decide whether there is a subset of at most k non-terminals N' ⊆ N such that G[T ∪ N']
is connected.

Unique Coverage : 

Input: A hypergraph {V,£) with \V\ = n and \£\ = m, and an integer k. 

Task: Decide whether there exists a subset ℰ' ⊆ ℰ such that at least k vertices are contained
in exactly one edge in ℰ'.
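
To illustrate the objective, the Python sketch below counts the vertices covered exactly once by a chosen family of edges and searches all sub-families by brute force. The edge representation (a list of frozensets) and the function names are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

def uniquely_covered(chosen_edges):
    """Vertices contained in exactly one of the chosen edges."""
    counts = Counter(v for e in chosen_edges for v in e)
    return {v for v, c in counts.items() if c == 1}

def unique_coverage(edges, k):
    """Return a sub-family covering at least k vertices exactly once, or None."""
    for size in range(len(edges) + 1):
        for chosen in combinations(edges, size):
            if len(uniquely_covered(chosen)) >= k:
                return list(chosen)
    return None

E = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({4, 5})]
print(uniquely_covered(E))
print(unique_coverage(E, 5))
```

Taking all three edges covers only the vertices 1, 2, and 5 uniquely, whereas dropping the middle edge covers all five, which shows that the objective is not monotone in the chosen family.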

Weakly Chordal Deletion : 

Input: A graph G on n vertices, and an integer k. 

Task: Decide whether G has a set S of at most k vertices such that G - S is weakly chordal.